DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Response to Amendment
This Office Action is in response to applicant’s communication filed 17 June 2022, in response to the Office Action mailed 3 March 2022.  The applicant’s remarks and any amendments to the claims or specification have been considered, with the results that follow.

Examiner’s Note: the remarks mention an amendment to the title and the filing of a terminal disclaimer, but neither of these appears to have been filed.


Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: WEIGHT QUANTIZATION IN NEURAL NETWORKS.


Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.


Claim Objections
Claim 1 is objected to because of the following informalities: “multiple weight in a class” appears as though it should be “multiple weights in a class”.  Appropriate correction is required.
Claims 2-7 depend upon claim 1, and thus include the aforementioned limitation(s).

Claim 8 is objected to because of the following informalities: “multiple weight in a class” appears as though it should be “multiple weights in a class”.  Appropriate correction is required.
Claims 9-19 depend upon claim 8, and thus include the aforementioned limitation(s).

Claim 20 is objected to because of the following informalities: “multiple weight in a class” appears as though it should be “multiple weights in a class”.  Appropriate correction is required.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-6, 8-18, and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-3, 6, 7, 10-14, and 16-18 of U.S. Patent No. 10,657,439 in view of Ambardekar (US Provisional Application No. 62/486432 – published as US 2018/0300603 – the provisional application is cited in the rejections below).  As the rejections are maintained from the prior action, only the amended elements have been highlighted below.

As per claim 1, the claim is compared with claim 7 of U.S. Patent No. 10,657,439 —whereas the differences between the instant application and the copending application have been highlighted—as follows: 
Instant Application: wherein the quantized weight is a center weight of multiple weight in a class
U.S. Patent No. 10,657,439: wherein the lookup table unit includes at least one lookup table selected from a group consisting of: a multiplication lookup table that includes one or more multiplication results, wherein each of the one or more multiplication results respectively corresponds to a central weight and a central neuron, wherein the central weight corresponds to one of the weight indexes
Examiner’s Note: here the central weight corresponding to the weight index is the center weight of the weight in the class



As per claim 8, the claim is compared with claim 13 of U.S. Patent No. 10,657,439 —whereas the differences between the instant application and the copending application have been highlighted—as follows: 
Instant Application: wherein the quantized weight is a center weight of multiple weight in a class
U.S. Patent No. 10,657,439: wherein the operation results include a result of at least one of the following operations: addition, multiplication, and pooling, where pooling includes average pooling, maximum pooling, and median pooling
Examiner’s Note: here the median pooling corresponds to the central (median) weight



As per claim 20, see the rejection of claim 1, above.


Claims 7 and 19 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 6, 10, 16, and 17 of U.S. Patent No. 10,657,439 in view of Ambardekar (US Provisional Application No. 62/486432 – published as US 2018/0300603 – the provisional application is cited in the rejections below) and further in view of Deisher (US 2014/0300758).  The rejections are maintained from the prior action.


Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 13-15 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the enablement requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.

The factors to be considered are as follows:
(A) the breadth of the claims;
(B) the nature of the invention;
(C) the state of the prior art;
(D) the level of one of ordinary skill;
(E) the level of predictability in the art;
(F) the amount of direction provided by the inventor;
(G) the existence of working examples;
(H) the quantity of experimentation needed to make or use the invention based on the content of the disclosure.

Claim 13 fails to comply with the enablement requirement, as the specification does not describe how the claimed subject matter is achieved.  Namely, the specification does not describe how a single neural network dedicated instruction includes an operation instruction that performs ALL of an arithmetic operation including a matrix operation instruction, a vector operation instruction, a scalar operation instruction, a CNN operation instruction, a fully connected NN operation instruction, a pooling NN operation instruction, a RBM operation instruction, a LRN operation instruction, a LCN operation instruction, a LSTM operation instruction, a RNN operation instruction, a RELU operation instruction, a PRELU operation instruction, a SIGMOID operation instruction, a TANH operation instruction, and a MAXOUT operation instruction, and/or a logical instruction that performs both a logical operation including a vector logical operation instruction and a scalar logical operation instruction; all in the same instruction.  Regarding the above-identified factors to be considered, claim 13 is addressed as follows:
Regarding factor (A), the breadth of the claim is fairly narrow as to the type of operations involved, although it does not specify specific operations (add, branch, load, etc.); this weighs against enablement.
Regarding factor (B), the nature of the invention is related to a neural network accelerator including specific units and instructions; this does not weigh toward or against enablement.
Regarding factor (C), the prior art, including that cited, includes various neural network accelerators, including specialized instructions and units; this weighs toward enablement.
Regarding factor (D), the level of ordinary skill in the art would be familiar with neural network accelerators; this weighs toward enablement.
Regarding factor (E), the level of predictability in the art includes deterministic outputs, but varying levels of performance depending on the dataset; this weighs toward enablement.
Regarding factor (F), the inventor has provided a description in the specification that is essentially similar in scope to that given in the claim (see, e.g., pgs. 19, 20, and 26 of the specification); this weighs against enablement.
Regarding factor (G), the inventor has not provided any working examples of a single instruction containing all of the listed instructions/operations; this weighs against enablement.
Regarding factor (H), as the inventor has not provided any working examples, the quantity of experimentation needed to make or use the claimed invention is necessarily high; this weighs against enablement.
Considering these factors, and the evidence as a whole, the claim fails to comply with the enablement requirement.
Claim 9 depends upon claim 13, and thus includes the aforementioned limitation(s).

Claim 14 fails to comply with the enablement requirement, as the specification does not describe how the claimed subject matter is achieved.  Namely, the specification does not describe how a single neural network dedicated instruction includes at least a Cambricon instruction (composed of operation code and operand) and the Cambricon instruction includes at least one of: a Cambricon control instruction including BOTH a jump instruction and a conditional branch instruction, a Cambricon data transfer instruction including ALL of a loading instruction, a storage instruction and a moving instruction, a Cambricon operation instruction including ALL of a Cambricon matrix instruction, a Cambricon vector instruction, and a Cambricon scalar instruction – the matrix instruction including a matrix-vector multiplication, a matrix-add-matrix operation, and a matrix-subtract-matrix operation; the vector instruction including a vector elementary operation, a vector transcendental function operation, a dot product operation, a random vector generation operation, and an operation of max/min of a vector; and the scalar instruction including a scalar elementary operation and a scalar transcendental function, and/or a Cambricon logical instruction including ALL of a Cambricon vector logical instruction and a Cambricon scalar logical instruction – the vector logical instruction including a vector comparing operation and a vector logical operation, the logical operation including AND, OR, and NOT, and the scalar logical instruction including scalar comparing and scalar logical operations; where all of the listed operations are performed by a single instruction.  Regarding the above-identified factors to be considered, claim 14 is addressed as follows:
Regarding factor (A), the breadth of the claim is fairly narrow as to the type of operations involved; this weighs against enablement.
Regarding factor (B), the nature of the invention is related to a neural network accelerator including specific units and instructions; this does not weigh toward or against enablement.
Regarding factor (C), the prior art, including that cited, includes various neural network accelerators, including specialized instructions and units; this weighs toward enablement.
Regarding factor (D), the level of ordinary skill in the art would be familiar with neural network accelerators; this weighs toward enablement.
Regarding factor (E), the level of predictability in the art includes deterministic outputs, but varying levels of performance depending on the dataset; this weighs toward enablement.
Regarding factor (F), the inventor has provided a description in the specification that is essentially similar in scope to that given in the claim (see, e.g., pgs. 20-21 and 26-28 of the specification); this weighs against enablement.
Regarding factor (G), the inventor has not provided any working examples of a single instruction containing all of the listed instructions; this weighs against enablement.
Regarding factor (H), as the inventor has not provided any working examples, the quantity of experimentation needed to make or use the claimed invention is necessarily high; this weighs against enablement.
Considering these factors, and the evidence as a whole, the claim fails to comply with the enablement requirement.

Claim 15 depends upon claim 14, and thus includes the aforementioned limitation(s).
Examiner’s Note: Claim 15 also further specifies the operation elements of the instructions of claim 14, that themselves each comprise multiple operations (which are not enabled as a single operation), including: the vector elementary operation, the vector transcendental operation, the scalar elementary operation, the scalar transcendental function, the vector comparing operation, the vector logical operation, the scalar comparing operation, and the scalar logical operation.  The Wands factors, as evaluated in the manner described above regarding claim 14, apply to both the instructions and the operations.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 2, 5, 6, 8, 9, 12, 13, 16-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ambardekar (US Provisional Application No. 62/486432 – published as US 2018/0300603 – the provisional application is cited in the rejections below) in view of Ahalt et al. (Competitive Learning Algorithms for Vector Quantization, 1990, pgs. 277-290).

As per claim 1, Ambardekar teaches a processing device, comprising: a control unit configured to receive an instruction and decode the instruction to generate search and control information and operation control information [a pipelined processor fetches and executes instructions using a controller and causing data transfer operations or data operand operations (pg. 2 regarding instruction and data transfers/fetching; see also pgs. 8-9 and 11-12 regarding execution of instructions by a controller); where determining the appropriate operation to execute for the instruction is decode]; a lookup table unit configured to receive the search and control information, a weight dictionary, and a weight codebook, and perform a table lookup operation on the weight dictionary and the weight codebook to obtain a quantized weight according to the search and control information [instructions are executed to perform neural network operations (pg. 2, see also pgs. 11-12, 16, etc.) including using fast weight lookup hardware (FWLH) to convert weight indices to rows and a fast weight lookup table (FWLT) (the weight dictionary and weight codebook), to store the quantized weights, where during the execution of a neuron compute operation requiring a weight value, an index is used to reference a specific vector of weight values in a VQ table in the FWLT that are used for computations (pgs. 34-36 including the figure showing the neural processor and the table of codebook depths, etc.)]; and an operation unit configured to receive the operation control information and an input neuron, perform an operation on the quantized weight and the input neuron according to the operation control information to obtain an output neuron, and output the output neuron [processing units for performing the neuron operations on the quantized weights and input data and producing an output (pgs. 16-17, see also pgs. 9, 26, etc.)].
While Ambardekar teaches performing vector quantization on the weights to create a codebook (see above) it does not teach the quantization process and thus does not explicitly teach wherein the quantized weight is a center weight of multiple weight in a class.
Ahalt teaches wherein the quantized weight is a center weight of multiple weight in a class [vector quantization is performed by a network to create a weight codebook (pg. 278, section II.A; etc.) where the quantized weight vector holds the center weight of the data cluster (pgs. 281-283, section III.B; etc.)].
Ambardekar and Ahalt are analogous art, as they are within the same field of endeavor, namely quantization of network parameters into a codebook.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to use the center weights of data clusters in the vector quantization of the weights to create the codebook, as taught by Ahalt, for the vector quantization of the weights to create the codebook in Ambardekar.
Ahalt provides motivation as [using the vector quantization encoding including the center weights achieves better compression performance while reducing the amount of computation necessary (pg. 277, section I; etc.) while applying any number of optimizations (pg. 279, section II; etc.)].
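
Illustrative sketch (not drawn from Ambardekar or Ahalt): the combination discussed for claim 1 can be pictured as a weight dictionary of class indices, a weight codebook holding one center weight per class, a table lookup, and an operation on the looked-up weights. The function names, the 1-D k-means clustering, and the dot-product operation below are assumptions chosen only for illustration; neither reference is asserted to use this exact procedure.

```python
def build_codebook(weights, num_classes, iterations=10):
    """Cluster scalar weights with a simple 1-D k-means (assumed method).

    Returns (dictionary, codebook): dictionary[i] is the class index of
    weights[i]; codebook[k] is the center (mean) weight of class k.
    """
    lo, hi = min(weights), max(weights)
    # Spread the initial centers evenly across the weight range.
    if num_classes == 1:
        centers = [(lo + hi) / 2]
    else:
        centers = [lo + (hi - lo) * k / (num_classes - 1)
                   for k in range(num_classes)]
    dictionary = [0] * len(weights)
    for _ in range(iterations):
        # Assign each weight to the class with the nearest center.
        dictionary = [min(range(num_classes), key=lambda k: abs(w - centers[k]))
                      for w in weights]
        # Move each center to the mean of the weights in its class.
        for k in range(num_classes):
            members = [w for w, idx in zip(weights, dictionary) if idx == k]
            if members:
                centers[k] = sum(members) / len(members)
    return dictionary, centers


def lookup_quantized_weights(dictionary, codebook):
    """Table lookup: replace each class index with its center weight."""
    return [codebook[idx] for idx in dictionary]


def neuron_output(quantized_weights, input_neurons):
    """Hypothetical operation: weighted sum of the input neurons."""
    return sum(w * x for w, x in zip(quantized_weights, input_neurons))
```

In this sketch only the small codebook and the per-weight indices need to be stored, which is the compression benefit the cited motivation refers to.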

As per claim 2, Ambardekar/Ahalt teaches a pre-processing unit configured to pre-process external input information to obtain the input neuron, the weight dictionary, the weight codebook, and the instruction [the weights used for the neuron calculations by the instructions and stored in the FWLT and FWLH are first quantized, converting contiguous segments of weight values into vectors of an arbitrary length and assigning the vector an index (Ambardekar: pg. 34, etc.)]; a storage unit configured to store the input neuron, the weight dictionary, the weight codebook, and the instruction and receive the output neuron [instructions and data, including weight data, may be stored in a main/general or local memory (Ambardekar: pg. 4 for instructions and data in general, pg. 33 for weight data, etc.) where the quantized weights for the FWLT and FWLH may be in local memory or cache, or main/general memory (caching unit or storage unit) (Ambardekar: pgs. 33 and 35, etc.)]; a caching unit configured to cache the instruction, the input neuron, the output neuron, the weight dictionary, and the weight codebook [the neuron data and weights may be stored in cache (Ambardekar: pgs. 28-29, etc.) where the quantized weights for the FWLT and FWLH may be in local memory or cache, or main/general memory (caching unit or storage unit) (Ambardekar: pgs. 33 and 35, etc.)]; and a DMA (direct memory access) unit configured to read/write data or instructions between the storage unit and the caching unit [a DMA controls moves from one memory to another, including instructions and data (Ambardekar: pgs. 2-4, 48, etc.)].
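
Illustrative sketch (not drawn from the cited references): the pre-processing described above, in which contiguous segments of weight values are converted into vectors and each vector is assigned an index, might look like the following. The helper names and the nearest-entry rule using squared Euclidean distance are assumptions for illustration only.

```python
def segment(weights, length):
    """Split a flat weight list into contiguous vectors of the given length."""
    if len(weights) % length:
        raise ValueError("weight count must be a multiple of the vector length")
    return [tuple(weights[i:i + length]) for i in range(0, len(weights), length)]


def nearest_index(vector, codebook):
    """Return the index of the codebook entry closest to the vector."""
    def squared_distance(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda k: squared_distance(codebook[k]))


def encode(weights, codebook, length):
    """Replace each contiguous weight vector with a codebook index."""
    return [nearest_index(v, codebook) for v in segment(weights, length)]


def decode(indices, codebook):
    """Expand indices back into the quantized weight values."""
    return [w for idx in indices for w in codebook[idx]]
```

Here `encode` would run once ahead of time, and `decode` corresponds to the lookup performed when a neuron computation needs weight values.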

As per claim 5, Ambardekar/Ahalt teaches wherein the storage unit is configured to store an unquantized weight, which is directly output to the operation unit [instructions and data, including weight data, may be stored in a main/general memory (Ambardekar: pg. 4 for instructions and data in general, pg. 33 for weight data, etc.)].

As per claim 6, Ambardekar/Ahalt teaches wherein the operation unit includes: a first operation part configured to multiply a weight and the input neuron; and/or a second operation part including one or a plurality of adders configured to add the weight and input neuron by one or a plurality of adders; and/or a third operation part configured to perform a nonlinear function on the weight and the input neuron, where the nonlinear function includes an active function, and the active function includes sigmoid, tanh, relu and/or softmax; and/or a fourth operation part configured to perform a pooling operation on the weight and the input neuron, where the pooling operation includes average pooling, maximum pooling, and/or median pooling, and the weight includes an unquantized weight and/or a quantized weight [the processor includes multiple computation units performing neuron operations including multiply, add, max, accumulate, etc. (Ambardekar: pg. 26, see also pgs. 9, 16-17, 46, etc.)].
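
Illustrative sketch (not drawn from the cited references): the four operation parts recited in claim 6 can be modeled as a multiplier, an adder tree, an activation function, and a pooling function. All function names are hypothetical, and the pairwise adder tree is only one assumed way of arranging a plurality of adders.

```python
import math
import statistics


def multiply(weights, neurons):
    """First part: element-wise product of weights and input neurons."""
    return [w * x for w, x in zip(weights, neurons)]


def adder_tree(values):
    """Second part: sum values pairwise, as a tree of two-input adders would."""
    values = list(values)
    if not values:
        return 0.0
    while len(values) > 1:
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # an odd element passes through unchanged
        values = paired
    return values[0]


def activate(x, kind="relu"):
    """Third part: scalar nonlinear (activation) function."""
    if kind == "sigmoid":
        return 1.0 / (1.0 + math.exp(-x))
    if kind == "tanh":
        return math.tanh(x)
    if kind == "relu":
        return max(0.0, x)
    raise ValueError(f"unknown activation: {kind}")


def softmax(values):
    """Third part, vector case: numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pool(window, kind="average"):
    """Fourth part: average, maximum, or median pooling over a window."""
    if kind == "average":
        return sum(window) / len(window)
    if kind == "maximum":
        return max(window)
    if kind == "median":
        return statistics.median(window)
    raise ValueError(f"unknown pooling: {kind}")
```

A single neuron output in this model would be `activate(adder_tree(multiply(weights, neurons)))`, where the weights may be quantized or unquantized.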

As per claim 8, Ambardekar teaches a processing method, comprising: receiving an input neuron, a weight dictionary, a weight codebook and an instruction [a pipelined processor fetches instructions and data and executes instructions to perform neuron operations on neuron data (pg. 2 regarding instruction and data transfers/fetching; see also pgs. 8-9 and 11-12 regarding execution of instructions by a controller; pgs. 16-17 for neuron operations; etc.) while utilizing a fast weight lookup hardware (FWLH) to convert weight indices to rows and a fast weight lookup table (FWLT) to store the quantized weights (the weight dictionary and weight codebook) (pgs. 34-36 including the figure showing the neural processor and the table of codebook depths, etc.)]; decoding the instruction to generate search and control information and operation control information [pipelined processor fetches and executes instructions using a controller and causing data transfer operations or data operand operations (pg. 2 regarding instruction and data transfers/fetching; see also pgs. 8-9 and 11-12 regarding execution of instructions by a controller); where determining the appropriate operation to execute for the instruction is decoding]; and looking up the weight dictionary and the weight codebook to obtain a quantized weight according to the search and control information [utilizing a fast weight lookup hardware (FWLH) to convert weight indices to rows and a fast weight lookup table (FWLT) to store the quantized weights (the weight dictionary and weight codebook) (pgs. 34-36 including the figure showing the neural processor and the table of codebook depths, etc.)], and performing an operation on the quantized weight and the input neuron according to the operation control information to obtain an output neuron and output the output neuron [processing units for performing the neuron operations on the quantized weights and input data and producing an output (pgs. 16-17, see also pgs. 9, 26, etc.)].
While Ambardekar teaches performing vector quantization on the weights to create a codebook (see above) it does not teach the quantization process and thus does not explicitly teach wherein the quantized weight is a center weight of multiple weight in a class.
Ahalt teaches wherein the quantized weight is a center weight of multiple weight in a class [vector quantization is performed by a network to create a weight codebook (pg. 278, section II.A; etc.) where the quantized weight vector holds the center weight of the data cluster (pgs. 281-283, section III.B; etc.)].
Ambardekar and Ahalt are analogous art, as they are within the same field of endeavor, namely quantization of network parameters into a codebook.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to use the center weights of data clusters in the vector quantization of the weights to create the codebook, as taught by Ahalt, for the vector quantization of the weights to create the codebook in Ambardekar.
Ahalt provides motivation as [using the vector quantization encoding including the center weights achieves better compression performance while reducing the amount of computation necessary (pg. 277, section I; etc.) while applying any number of optimizations (pg. 279, section II; etc.)].

As per claim 9, Ambardekar/Ahalt teaches wherein before receiving the input neuron, the weight dictionary, the weight codebook and the instruction, the method further includes: pre-processing external input information to obtain the input neuron, the weight dictionary, the weight codebook, and the instruction [the weights used for the neuron calculations by the instructions and stored in the FWLT and FWLH are first quantized, converting contiguous segments of weight values into vectors of an arbitrary length and assigning the vector an index (Ambardekar: pg. 34, etc.)].

As per claim 12, Ambardekar/Ahalt teaches wherein the instruction is a neural network dedicated instruction [the processor includes multiple computation units performing neuron operations including multiply, add, max, accumulate, etc. (Ambardekar: pg. 26, see also pgs. 9, 16-17, 46, etc.)].

As per claim 13, Ambardekar/Ahalt teaches wherein the neural network dedicated instruction includes one of one or more instructions including: a control instruction configured to control an execution process of the neural network; a data transfer instruction configured to perform data transfer between different storage media, where a data format includes a matrix format, a vector format, and a scalar format; an operation instruction configured to perform an arithmetic operation on the neural network including a matrix operation instruction, a vector operation instruction, a scalar operation instruction, a convolutional neural network operation instruction, a fully connected neural network operation instruction, a pooling neural network operation instruction, a Restricted Boltzmann Machine (RBM) neural network operation instruction, a Local Response Normalization (LRN) neural network operation instruction, a Local Contrast Normalization (LCN) neural network operation instruction, a Long Short-Term Memory (LSTM) neural network operation instruction, a Recurrent Neural Networks (RNN) operation instruction, a Rectified Linear Unit (RELU) neural network operation instruction, a Parametric Rectified Linear Unit (PRELU) neural network operation instruction, a SIGMOID neural network operation instruction, a TANH neural network operation instruction and a MAXOUT neural network instruction; and a logical instruction configured to perform a neural network logical operation including a vector logical operation instruction and a scalar logical operation instruction [the processor executes multiple instructions including control and data transfer instructions, as well as instructions performing neuron operations including multiply, add, max, accumulate, etc. (Ambardekar: pgs. 9-10 for the control and data instructions from iterators, and see also pgs. 16-17, 26, 46, etc.), which at least includes a control instruction configured to control an execution process of the NN].

As per claim 16, Ambardekar/Ahalt teaches receiving an unquantized weight, and performing an operation on the unquantized weight and the input neuron to obtain an output neuron and output the output neuron [a pipelined processor fetches instructions and data and executes instructions to perform neuron operations on neuron data using the weights (Ambardekar: pg. 2 regarding instruction and data transfers/fetching; see also pgs. 8-9 and 11-12 regarding execution of instructions by a controller; pgs. 16-17 for neuron operations; etc.) where weight quantization may be enabled or disabled for specific NN layers (Ambardekar: pg. 37, etc.)].

As per claim 17, see the rejection of claim 6, above.

As per claim 18, Ambardekar/Ahalt teaches wherein one or a plurality of adders are configured to add the weight and input neuron [the processor includes multiple computation units performing neuron operations including multiply, add, max, accumulate, etc. (Ambardekar: pg. 26, see also pgs. 9, 16-17, 46, etc.) which can include using a MAC unit (Ambardekar: pg. 47, etc.)].

As per claim 20, see the rejection of claim 1, above, wherein Ambardekar/Ahalt also teaches an electronic device including the processing device [the neural network hardware system may be part of a system on chip including other systems or multiple NN’s (Ambardekar: pg. 32, see also pgs. 28, 37, 46, etc.)]. 


Claims 3, 4, 7, 10, 11, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Ambardekar (US Provisional Application No. 62/486432 – published as US 2018/0300603 – the provisional application is cited in the rejections below) in view of Ahalt et al. (Competitive Learning Algorithms for Vector Quantization, 1990, pgs. 277-290), and further in view of Deisher (US 2014/0300758).

As per claim 3, Ambardekar/Ahalt teaches the processing device of claim 2, as described above.
While Ambardekar/Ahalt teaches pre-processing in the form of quantization (see above), it does not explicitly teach wherein pre-processing external input information includes: segmentation, Gaussian filter, binarization, regularization, and/or normalization.
Deisher teaches wherein pre-processing external input information includes: segmentation, Gaussian filter, binarization, regularization, and/or normalization [the neural network data may be preprocessed including normalization (para. 0037, etc.)].
Ambardekar/Ahalt and Deisher are analogous art, as they are within the same field of endeavor, namely neural network acceleration including quantization.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include the pre-processing taught by Deisher, in the pre-processing in the system taught by Ambardekar/Ahalt.
Deisher provides motivation as [a front-end unit may perform various types of pre-processing on the data for neural network acceleration to improve cost, size, and power usage, etc. (paras. 0036-37, etc.)].

As per claim 4, Ambardekar/Ahalt teaches wherein the caching unit includes: an input neuron caching unit configured to cache the input neuron; and an output neuron caching unit configured to cache the output neuron [the neuron data may be stored in cache (Ambardekar: pgs. 28-29, etc.)].
While Ambardekar/Ahalt teaches fetching instructions from a memory and storing various data in caches (see above), it does not explicitly teach an instruction caching unit configured to cache the instruction.
Deisher teaches an instruction caching unit configured to cache the instruction [the processor performing NN operations may include a separate instruction cache (para. 0266, etc.)].
Ambardekar/Ahalt and Deisher are analogous art, as they are within the same field of endeavor, namely neural network acceleration including quantization.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include the instruction cache taught by Deisher in the caches in the system taught by Ambardekar/Ahalt.
Because both Ambardekar/Ahalt and Deisher teach systems utilizing neural network accelerators to perform NN related instruction processing, it would have been obvious to one of ordinary skill in the art to include the instruction cache taught by Deisher in the caches in the system taught by Ambardekar/Ahalt, to achieve the predictable result of improving instruction fetch times.

As per claim 7, see the rejection of claim 19, below.

As per claim 10, see the rejection of claim 4, above.

As per claim 11, see the rejection of claim 3, above.

As per claim 19, Ambardekar/Ahalt teaches the processing method of claim 18, as described above.
While Ambardekar/Ahalt teaches using a MAC unit/adder (see above), it does not explicitly teach wherein a plurality of adders constitute an adder tree to realize an addition of the weight and input neuron step by step.
Deisher teaches wherein a plurality of adders constitute an adder tree to realize an addition of the weight and input neuron step by step [the weighted inputs are then summed in an accumulator section of the MAC formed of an adder tree (para. 0081, etc.)].
Ambardekar/Ahalt and Deisher are analogous art, as they are within the same field of endeavor, namely neural network acceleration including quantization.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include the adder tree in the MAC, as taught by Deisher, in the MAC unit in the system taught by Ambardekar/Ahalt.
Deisher provides motivation as [the adder tree in the MAC may be used so that sufficient adder operations may be performed to realize an entire sum, or may be pipelined to maintain throughput if cycle time is too low (para. 0081, etc.)].
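For illustration only, the step-by-step addition performed by an adder tree can be sketched in software as a pairwise reduction, where each pass of the loop corresponds to one level of adders. This is a minimal sketch under that assumption and does not reproduce the hardware of Deisher or Ambardekar; the function names `adder_tree_sum` and `weighted_sum` are hypothetical.

```python
from typing import List


def adder_tree_sum(values: List[float]) -> float:
    """Sum values by pairwise addition, one tree level per loop pass."""
    if not values:
        return 0.0
    level = list(values)
    while len(level) > 1:
        next_level = []
        # Each adjacent pair feeds one adder at this level of the tree.
        for i in range(0, len(level) - 1, 2):
            next_level.append(level[i] + level[i + 1])
        # An unpaired last value passes through to the next level unchanged.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level
    return level[0]


def weighted_sum(weights: List[float], inputs: List[float]) -> float:
    """Multiply each weight by its input neuron, then reduce with the tree."""
    products = [w * x for w, x in zip(weights, inputs)]
    return adder_tree_sum(products)
```

For n products the reduction takes about log2(n) levels, which is why such a tree lends itself to pipelining.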


Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Ambardekar (US Provisional Application No. 62/486432 – published as US 2018/0300603 – the provisional application is cited in the rejections below) in view of Ahalt et al. (Competitive Learning Algorithms for Vector Quantization, 1990, pgs. 277-290), and further in view of Liu et al. (Cambricon: An Instruction Set Architecture for Neural Networks, June 2016, pgs. 393-405).

As per claim 14, Ambardekar/Ahalt teaches the processing method of claim 12, as described above.
While Ambardekar/Ahalt teaches the neural network dedicated instructions (see above), it does not explicitly teach wherein the neural network dedicated instruction includes at least a Cambricon instruction composed of an operation code and an operand, and the Cambricon instruction includes one of one or more instructions including: a Cambricon control instruction, including a jump instruction and a conditional branch instruction, configured to control an execution process; a Cambricon data transfer instruction, including a loading instruction, a storage instruction, and a moving instruction, configured to transfer data between different storage media; where the loading instruction is configured to load data from a main memory to a cache; the storage instruction is configured to store data from the cache to the main memory; and the moving instruction is configured to move data from the cache to another cache or from the cache to a register or from the register to another register; a Cambricon operation instruction, including a Cambricon matrix operation instruction, a Cambricon vector operation instruction, and a Cambricon scalar operation instruction, configured to perform a neural network arithmetic operation; where the Cambricon matrix operation instruction is configured to complete a matrix operation in the neural network, and the Cambricon matrix operation includes a matrix-vector multiplication operation, a vector multiply matrix operation, a matrix multiply scalar operation, an outer product operation, a matrix-add-matrix operation, and a matrix-subtract-matrix operation; the Cambricon vector operation instruction is configured to complete a vector operation in the neural network, and the Cambricon vector operation includes a vector elementary operation, a vector transcendental function operation, a dot product operation, a random vector generation operation, and an operation of maximum/minimum of a vector; and the Cambricon scalar operation instruction is configured to complete a scalar operation in the neural network, and the Cambricon scalar operation includes a scalar elementary operation and a scalar transcendental function; and a Cambricon logical instruction, including a Cambricon vector logical operation instruction and a Cambricon scalar logical operation instruction, configured for the logical operation of the neural network; where the Cambricon vector logical operation instruction is configured for a vector comparing operation and a vector logical operation, the vector logical operation includes AND, OR, and NOT, and the Cambricon scalar logical operation instruction is configured for a scalar comparing operation and a scalar logical operation.
Liu teaches wherein the neural network dedicated instruction includes at least a Cambricon instruction composed of an operation code and an operand [an instruction set architecture of dedicated neural network instructions called Cambricon (pg. 393, abstract; etc.) where the Cambricon instructions include opcodes and operands (pgs. 395-397, figures 1-6; etc.)], and the Cambricon instruction includes one of one or more instructions including: a Cambricon control instruction, including a jump instruction and a conditional branch instruction, configured to control an execution process [a Cambricon instruction type Control, which includes jumps and conditional branch (pg. 395, Table I and associated description)]; a Cambricon data transfer instruction, including a loading instruction, a storage instruction, and a moving instruction, configured to transfer data between different storage media; where the loading instruction is configured to load data from a main memory to a cache; the storage instruction is configured to store data from the cache to the main memory; and the moving instruction is configured to move data from the cache to another cache or from the cache to a register or from the register to another register [a Cambricon instruction type Data Transfer which includes load/store/move for matrix, vector, and scalar data types (pg. 395, Table I and associated description)]; a Cambricon operation instruction, including a Cambricon matrix operation instruction, a Cambricon vector operation instruction, and a Cambricon scalar operation instruction, configured to perform a neural network arithmetic operation; where the Cambricon matrix operation instruction is configured to complete a matrix operation in the neural network, and the Cambricon matrix operation includes a matrix-vector multiplication operation, a vector multiply matrix operation, a matrix multiply scalar operation, an outer product operation, a matrix-add-matrix operation, and a matrix-subtract-matrix operation; the Cambricon vector operation instruction is configured to complete a vector operation in the neural network, and the Cambricon vector operation includes a vector elementary operation, a vector transcendental function operation, a dot product operation, a random vector generation operation, and an operation of maximum/minimum of a vector; and the Cambricon scalar operation instruction is configured to complete a scalar operation in the neural network, and the Cambricon scalar operation includes a scalar elementary operation and a scalar transcendental function [a Cambricon instruction type Computational which includes arithmetic operations (including multiply, outer product, add, subtract, divide, transcendental functions, random, max/min, etc.) on matrix, vector, and scalar data types (pg. 395, Table I and associated description)]; and a Cambricon logical instruction, including a Cambricon vector logical operation instruction and a Cambricon scalar logical operation instruction, configured for the logical operation of the neural network; where the Cambricon vector logical operation instruction is configured for a vector comparing operation and a vector logical operation, the vector logical operation includes AND, OR, and NOT, and the Cambricon scalar logical operation instruction is configured for a scalar comparing operation and a scalar logical operation [a Cambricon instruction type Logical which includes logical operations (comparisons, AND/OR/NOT, etc.) on vector and scalar data (pg. 395, Table I and associated description)].
Ambardekar/Ahalt and Liu are analogous art, as they are within the same field of endeavor, namely neural network accelerators using dedicated instructions.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include the Cambricon instructions for dedicated neural network instructions, as taught by Liu, in the dedicated neural network instructions taught by Ambardekar/Ahalt.
Liu provides motivation as [Cambricon provides succinct, flexible, lightweight, and efficient instructions for neural networks to efficiently support large, variable data widths (pg. 394, sections I, I.A; pgs. 398-403, sections IV-VIII; etc.)].
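For illustration only, an instruction composed of an operation code and operands, grouped into the four instruction types discussed above (control, data transfer, computational, and logical), could be modeled as follows. This is a minimal sketch: the opcode names in `OPCODE_TYPES` are hypothetical placeholders and are not the mnemonics or encodings of Liu, and `Instruction` and `classify` are likewise assumed names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class InstructionType(Enum):
    CONTROL = "control"
    DATA_TRANSFER = "data transfer"
    COMPUTATIONAL = "computational"
    LOGICAL = "logical"


# Hypothetical opcode names chosen for illustration; they are NOT the
# actual Cambricon mnemonics or binary encodings.
OPCODE_TYPES = {
    "JUMP": InstructionType.CONTROL,
    "COND_BRANCH": InstructionType.CONTROL,
    "LOAD": InstructionType.DATA_TRANSFER,
    "STORE": InstructionType.DATA_TRANSFER,
    "MOVE": InstructionType.DATA_TRANSFER,
    "MAT_MUL_VEC": InstructionType.COMPUTATIONAL,
    "VEC_ADD": InstructionType.COMPUTATIONAL,
    "SCALAR_ADD": InstructionType.COMPUTATIONAL,
    "VEC_AND": InstructionType.LOGICAL,
    "VEC_COMPARE": InstructionType.LOGICAL,
    "SCALAR_COMPARE": InstructionType.LOGICAL,
}


@dataclass(frozen=True)
class Instruction:
    opcode: str                 # the operation code
    operands: Tuple[str, ...]   # registers, addresses, or sizes, as strings


def classify(instr: Instruction) -> InstructionType:
    """Return the instruction type for a given opcode."""
    try:
        return OPCODE_TYPES[instr.opcode]
    except KeyError:
        raise ValueError(f"unknown opcode: {instr.opcode}") from None
```

For example, `classify(Instruction("LOAD", ("r1", "0x100")))` returns `InstructionType.DATA_TRANSFER` under these assumed names.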


Response to Arguments
Applicant's arguments filed 17 June 2022 have been fully considered but they are not persuasive.

As noted above, while the remarks mention an amendment to the title, no such amendment appears to have been filed.  Therefore, the objection to the specification has been maintained.

As noted above, while the remarks mention a Terminal Disclaimer, none appears to have been filed.  Therefore, the double patenting rejections have been maintained.

The objections to claims 7, 9-11, and 16 have been withdrawn due to the amendments filed.

The rejections of claims 7, and 9-11 under 35 U.S.C. 112(a) have been withdrawn due to the amendments filed.

As noted in the remarks, applicant has amended claims 13 and 14 such that a single instruction includes one of the listed instructions.
However, the listed instructions and operations still include instructions/operations which themselves include multiple instructions/operations which are not enabled as a single instruction/operation, as described above (e.g., the “operation instruction” of claim 13 includes a whole list of operation instructions within a single instruction, while the “scalar elementary operation” of claim 15 includes a whole list of operations).

The rejections of claims 6, 7, and 17-19 under 35 U.S.C. 112(b) have been withdrawn due to the amendments filed.

Applicant’s arguments with respect to the rejections under 35 U.S.C. 102 and 103 have been considered but are moot because the new grounds of rejection do not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i): claims 1-20 are rejected.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Lane et al. (Squeezing Deep Learning into Mobile and Embedded Devices, July 2017, pgs. 82-88) and Kim (US 2015/0332690) – disclose systems using a weight codebook and dictionary.
Chu et al. (Vector Quantization of Neural Networks, Nov. 1998, pgs. 1235-1245) – discloses using different models for vector quantization to create a codebook of weights using a centroid of a cell.

The examiner requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.

When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections.  See 37 CFR 1.111(c).

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
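For illustration only, the date arithmetic in the preceding paragraph can be sketched as follows. This is a minimal sketch under stated assumptions: the function names `add_months` and `shortened_period_end` are hypothetical, months are counted as calendar months with the day clamped to the end of a shorter month, no adjustment is made for weekends or federal holidays, and any dates passed in are examples rather than dates from this action.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift a date by calendar months, clamping the day to the month's end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def shortened_period_end(
    mailed: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """Return the end of the shortened statutory period per the rule above."""
    three_months = add_months(mailed, 3)
    six_months = add_months(mailed, 6)
    end = three_months
    # If a first reply was filed within two months and the advisory action
    # is mailed after the three-month period, the period ends on the
    # advisory action's mailing date.
    if (
        first_reply is not None
        and advisory_mailed is not None
        and first_reply <= add_months(mailed, 2)
        and advisory_mailed > three_months
    ):
        end = advisory_mailed
    # In no event later than six months from the mailing date.
    return min(end, six_months)
```

Any extension fee under 37 CFR 1.136(a) would then be measured from the date this function returns in the advisory-action case; that fee calculation is not modeled here.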

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE GIROUX whose telephone number is (571)272-9769. The examiner can normally be reached M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached on 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GEORGE GIROUX/Primary Examiner, Art Unit 2128