DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-21 are presented for examination.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on October 29, 2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings are objected to because (a) the last instance of reference character “607B” in Fig. 6A should be “607N”; (b) reference character “1410N” (Fig. 14A) is in the drawings but not the specification; (c) memory “1444B” (Fig. 14B) should be changed to “1434B” for consistency with the specification and global scheduler should have a reference character other than “1434” to avoid confusion with the memories; (d) in Fig. 20, reference character “20020” should be “2006” for consistency with the specification; (e) Figs. 21-22 and 26A-26B have text on a shaded background, see 37 CFR § 1.84(p)(3); (f) in Fig. 23, the subscript in “m1” crosses and mingles with the lines, see id.; (g) Fig. 24 is unlabeled; and (h) in specification paragraph 218, the second instance of “processing block 2510” should be “processing block 2520” for consistency with the drawings.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. 
The objection to the drawings will not be held in abeyance.

Specification
Examiner objects to the specification for various informalities.  Examiner has attached a marked-up copy of the specification indicating where it appears errors have occurred.  To the extent that these markings are not self-explanatory and not corrected, Examiner will enumerate the remaining objections in a subsequent Office Action.
The use of the terms BLUETOOTH (paragraphs 41, 202-03), WI-FI (paragraph 41), LINUX (paragraph 126), and MICROSOFT (paragraphs 109, 126), which are trade names or marks used in commerce, has been noted in this application. The terms should be accompanied by the generic terminology; furthermore, the terms should be capitalized wherever they appear or, where appropriate, include a proper symbol indicating use in commerce such as ™, SM, or ® following the terms.
Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.

Claim Objections
Examiner objects to claims 7, 9-14, and 20.
Claims 7, 13, and 20 are objected to because of the following informalities:  “the has the first value” should be “the sign has the first value”.  
Claim 9 is objected to because of the following informalities: the claim appears to vacillate between method-like language and system-like language (“[a] method … comprising: accelerator circuitry, including: executing …; storing …; and performing….”).  (Emphasis added.)  While it is reasonably clear that Applicant intended to draft a method claim, the switching of language casts some degree of doubt as to which statutory category was intended.  Examiner recommends that Applicant amend the claim to read “[a] method … comprising: executing, by accelerator circuitry, …; storing, by the accelerator circuitry …; and performing, by the accelerator circuitry …”.
All claims dependent on a claim objected to hereunder are also objected to for being dependent on an objected-to base claim.
Appropriate correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-3, 9, and 15-16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Falcón et al. (EP 3035203) (“Falcón”).
Regarding claim 1, Falcón discloses “[a]n apparatus to facilitate execution of non-linear functions operations comprising accelerator circuitry, including: 
a compute grid having a plurality of processing elements to execute neural network computations, store values resulting from the neural network computations (in an apparatus for performing convolution operations for a neural network, the apparatus comprises a plurality of processing units (PUs) – Falcón, abstract; in a fully connected 1-to-1 operation, one logical neuron is mapped to one physical neuron computed in a single PU, and input neurons and weights are fetched for every dot product operation [neural network computation] – id. at paragraph 62; any partial result that needs to be stored will proceed before a maximum time that is fixed and known [suggesting that the results of computation are stored] – id. at paragraph 75; see also Fig. 14A (showing the PUs)), and perform piecewise linear (PWL) approximations of one or more non-linear functions using the stored values as input data (an activation function [non-linear function] unit includes a polymorphic decoder, a lookup table (LUT), and a piecewise interpolation approximation unit; the polymorphic decoder maps each input X [stored value] to a range in an abscissa space, and parameters stored in the LUT for a given linear segment are used by the piecewise interpolation approximation unit to compute a final result – Falcón, paragraphs 125-27; see also Fig. 36).”

Claim 9 is a method claim corresponding to apparatus claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 2, Falcón discloses “one or more lookup tables (LUTs) to store a plurality of non-linear parameter values for the PWL approximations (in the activation function unit, a LUT unit is a component where parameters of linear interpolation segments are stored; parameters stored in the LUT are used by the piecewise interpolation approximation unit to compute the final result --  Falcón, paragraph 127).”  
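
For illustration, the arrangement described in the cited passages (a decoder that maps an input to a segment, a LUT that holds the parameters of each linear segment, and an approximation unit that computes the final result) can be sketched as follows. This is a minimal sketch under stated assumptions, not a reproduction of Falcón's circuit: the sigmoid target function, the breakpoint values, and the names `build_lut`, `decode_segment`, and `pwl_eval` are all hypothetical and chosen only for this example.

```python
import bisect
import math


def sigmoid(x):
    """Example non-linear function (an assumption; any activation would do)."""
    return 1.0 / (1.0 + math.exp(-x))


def build_lut(fn, breakpoints):
    """Store one (slope, intercept) pair per segment between breakpoints."""
    lut = []
    for x0, x1 in zip(breakpoints, breakpoints[1:]):
        slope = (fn(x1) - fn(x0)) / (x1 - x0)
        intercept = fn(x0) - slope * x0
        lut.append((slope, intercept))
    return lut


def decode_segment(x, breakpoints):
    """Map an input to the index of the segment that contains it.

    Inputs outside the covered range are clamped to the first or last segment.
    """
    idx = bisect.bisect_right(breakpoints, x) - 1
    return min(max(idx, 0), len(breakpoints) - 2)


def pwl_eval(x, breakpoints, lut):
    """Evaluate the piecewise linear approximation at x."""
    slope, intercept = lut[decode_segment(x, breakpoints)]
    return slope * x + intercept


# Hypothetical breakpoints, denser near zero where the sigmoid curves most.
BREAKPOINTS = [-8.0, -4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0, 8.0]
LUT = build_lut(sigmoid, BREAKPOINTS)

if __name__ == "__main__":
    for x in (-3.0, 0.0, 0.5, 3.0):
        print(x, pwl_eval(x, BREAKPOINTS, LUT), sigmoid(x))
```

In this sketch the LUT holds only the per-segment parameters, so evaluating the approximation costs one lookup, one multiplication, and one addition per input.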

Regarding claim 3, Falcón discloses that “each of the plurality of processing elements comprises: 
a first stage including a multiplier to perform a multiplication operation of an input data value and a first interval version of a first parameter value during a first processing cycle of a PWL approximation (after four PU cycles, the PUs have both the inputs [input data values] and weights [parameter values] required for the computation so they can perform the dot product [multiplication] operation – Falcón, paragraph 62 [note that, since the output of the multiplication and accumulation is fed into the activation function unit, which performs the PWL approximation (see paragraph 114), this step can be regarded as part of the PWL approximation]); and 
a second stage including an accumulator to perform a subtraction operation of a first interval version of a second parameter from a result of the multiplication operation (after four PU cycles, the PUs have both the inputs and the weights required for the computation, so they can perform the dot product operation [in the first stage] and accumulate the result [in the second stage] with the previous computation [i.e., the dot product of the previous weights [second parameter] and the previous inputs] if the neuron has many inputs – Falcón, paragraph 62 [note that accumulation is adding the current result to the previous results, which is equivalent mathematically to subtracting the negative of the previous results]).”
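
The arithmetic identity relied on in the bracketed note above (adding a previous result is the same as subtracting its negative) can be checked with a short sketch. The function names `multiply_stage`, `accumulate_stage`, and `subtract_stage` are hypothetical labels for this example only and do not appear in the claims or in Falcón; integer values are assumed so that the comparison is exact.

```python
def multiply_stage(input_value, parameter):
    """First stage: multiply an input data value by a parameter value."""
    return input_value * parameter


def accumulate_stage(previous_result, product):
    """Second stage written as an accumulation (addition)."""
    return previous_result + product


def subtract_stage(product, parameter):
    """Second stage written as a subtraction of a parameter from the product."""
    return product - parameter


if __name__ == "__main__":
    product = multiply_stage(3, 4)   # current dot-product term
    previous = 5                     # result carried over from earlier terms
    # Accumulating `previous` equals subtracting the parameter `-previous`.
    print(accumulate_stage(previous, product))
    print(subtract_stage(product, -previous))
```

The sketch shows only that the two formulations produce the same number when the subtracted parameter is taken to be the negative of the previous result.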

Regarding claim 15, Falcón discloses “[a]n accelerator comprising:  
one or more memory devices to store one or more one or more lookup tables (LUTs) (in the activation function unit, a LUT unit [memory device] is a component where parameters of linear interpolation segments are stored; parameters stored in the LUT are used by the piecewise interpolation approximation unit to compute the final result --  Falcón, paragraph 127); and 
a compute grid having a plurality of a plurality of tiles, each including a plurality of processing elements to execute neural network computations, store values resulting from the neural network computations (in an apparatus for performing convolution operations for a neural network, the apparatus comprises a plurality of processing units (PUs) – Falcón, abstract; in a fully connected 1-to-1 operation, one logical neuron is mapped to one physical neuron computed in a single PU, and input neurons and weights are fetched for every dot product operation [neural network computation] – id. at paragraph 62; any partial result that needs to be stored will proceed before a maximum time that is fixed and known [suggesting that the results of computation are stored] – id. at paragraph 75; see also Fig. 14A (showing the PUs)), and perform piecewise linear (PWL) approximations of one or more non-linear functions using the stored values as input data and non-linear parameter values stored in the one or more LUTs (in the activation function (AF) [non-linear function] unit, a LUT unit is a component where parameters of linear interpolation segments [non-linear parameter values] are stored; parameters stored in the LUT are used by the piecewise interpolation approximation unit to compute the final result --  Falcón, paragraph 127; AF unit may be incorporated into the processing unit of a neuromorphic accelerator – id. at paragraph 130; see also Fig. 36 [showing the input X [stored values] being input to the activation function unit]).”

Regarding claim 16, Falcón discloses that “each of the plurality of processing elements comprises: 
a first stage including a multiplier to perform a multiplication operation of an input data value and a first interval version of a first parameter value during a first processing cycle of a PWL approximation (after four PU cycles, the PUs have both the inputs [input data values] and weights [parameter values] required for the computation so they can perform the dot product [multiplication] operation – Falcón, paragraph 62 [note that, since the output of the multiplication and accumulation is fed into the activation function unit, which performs the PWL approximation (see paragraph 114), this step can be regarded as part of the PWL approximation]); and 
a second stage including an accumulator to perform a subtraction operation of a first interval version of a second parameter from a result of the multiplication operation (after four PU cycles, the PUs have both the inputs and the weights required for the computation, so they can perform the dot product operation [in the first stage] and accumulate the result [in the second stage] with the previous computation [i.e., the dot product of the previous weights [second parameter] and the previous inputs] if the neuron has many inputs – Falcón, paragraph 62 [note that accumulation is adding the current result to the previous results, which is equivalent mathematically to subtracting the negative of the previous results]).”

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 4, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Falcón in view of Pillai et al. (US 20190042922) (“Pillai”).
Regarding claim 4, Falcón, as modified by Pillai, discloses that “the accumulator sets an enable bit indicating a sign of the result of the subtraction operation (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative – Pillai, paragraph 62; see also paragraphs 31-32 (disclosing that the activation function operates on the result of multiply and accumulate operations, i.e., the subtraction operation)).”  
Pillai and the instant application relate to piecewise linear approximation circuits for neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcón to set a bit indicating a sign of the result of accumulation operations, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).
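
The sign-based selection described in the cited portion of Pillai (a multiplexer whose selection signal is based on the sign bit of the input, so that either the input or 0 is output) can be illustrated with the following minimal sketch. It assumes a 16-bit two's-complement representation, which is an assumption made for this example and is not taken from Pillai; the names `sign_bit` and `relu_mux` are likewise hypothetical.

```python
def sign_bit(value, width=16):
    """Return the most significant bit of value in two's complement.

    The width of 16 bits is an assumption for illustration.
    """
    mask = (1 << width) - 1
    return ((value & mask) >> (width - 1)) & 1


def relu_mux(value, width=16):
    """Select 0 when the sign bit is set, otherwise pass the input through."""
    select = sign_bit(value, width)
    return 0 if select else value


if __name__ == "__main__":
    for v in (7, 0, -5):
        print(v, sign_bit(v), relu_mux(v))
```

Here the sign bit acts as the only control signal: no comparison against zero is performed beyond reading that single bit.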

Regarding claim 10, Falcón discloses that “performing the PWL approximations comprises performing a first processing cycle, including: 
performing a multiplication operation to multiply an input data value and a first interval version of a first parameter value (after four PU cycles, the PUs have both the inputs [input data values] and weights [parameter values] required for the computation so they can perform the dot product [multiplication] operation – Falcón, paragraph 62 [note that, since the output of the multiplication and accumulation is fed into the activation function unit, which performs the PWL approximation (see paragraph 114), this step can be regarded as part of the PWL approximation]);  [and]
performing a subtraction operation of a first interval version of a second parameter from the result of the multiplication operation (after four PU cycles, the PUs have both the inputs and the weights required for the computation, so they can perform the dot product operation [in the first stage] and accumulate the result [in the second stage] with the previous computation [i.e., the dot product of the previous weights [second parameter] and the previous inputs] if the neuron has many inputs – Falcón, paragraph 62 [note that accumulation is adding the current result to the previous results, which is equivalent mathematically to subtracting the negative of the previous results])….” 
Pillai discloses “setting an enable bit indicating a sign of the result of the subtraction operation (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative – Pillai, paragraph 62; see also paragraphs 31-32 (disclosing that the activation function operates on the result of multiply and accumulate operations, i.e., the subtraction operation)).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcón to set a bit indicating a sign of the result of accumulation operations, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Regarding claim 17, Falcón, as modified by Pillai, discloses that “the accumulator sets an enable bit indicating a sign of the result of the subtraction operation (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative – Pillai, paragraph 62; see also paragraphs 31-32 (disclosing that the activation function operates on the result of multiply and accumulate operations, i.e., the subtraction operation)).” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Falcón to set a bit indicating a sign of the result of accumulation operations, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Claims 5-8, 11-14, and 18-21 are rejected under 35 U.S.C. 103 as being unpatentable over Falcón in view of Pillai and further in view of Deisher et al. (US 20180121796) (“Deisher”).
Regarding claim 5, Falcón, as modified by Pillai and Deisher, discloses that “the accumulator performs an addition operation of a first interval version of a third parameter and the result of the subtraction operation during a second processing cycle of the PWL approximation (in an activation function unit of a neural network accelerator, after a multiply and accumulate step, the sum value is passed to another accumulator where a bias [third parameter] may be added to the neural network sum output [result of the subtraction operation]; the sum buffer may hold multiple sums, allowing the accumulation to generate an output every cycle despite the fact that the activation function may take several cycles to be performed – Deisher, paragraph 94 [i.e., the multiplication and accumulation may take place in one cycle and the addition of the bias term may take place in a second cycle]; see also paragraph 84 (disclosing that the MAC may require multiple clock cycles to receive an entire input vector over multiple input sets), Fig. 5; compare specification paragraph 213 (disclosing that the second-step addition is of bias term ci)).”  
Deisher and the instant application both relate to physical implementations of piecewise linear approximators of activation functions of neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Pillai to add a third parameter to the result of the multiply-and-accumulate operations in a separate cycle, as disclosed by Deisher, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would add more flexibility to the system by not restricting the output to a product of weights and inputs and allowing an arbitrary constant to be added to the result.  See Deisher, paragraph 94.
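
The two-step sequence discussed above (a multiply-and-accumulate step followed by a separate step in which a bias is added to the sum) can be sketched as follows. This is a simplified software illustration under stated assumptions, not a model of Deisher's hardware: the function names `mac_cycle` and `bias_cycle` are hypothetical, and each function call stands in for one processing cycle.

```python
def mac_cycle(inputs, weights):
    """First step: multiply each input by its weight and accumulate the sum."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc += x * w
    return acc


def bias_cycle(sum_value, bias):
    """Second step: add the bias term to the accumulated sum."""
    return sum_value + bias


if __name__ == "__main__":
    total = mac_cycle([1, 2, 3], [4, 5, 6])   # 1*4 + 2*5 + 3*6
    print(bias_cycle(total, 10))
```

Keeping the bias addition as its own step means the output is not restricted to a pure product of weights and inputs, which is the flexibility the rationale above refers to.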

Regarding claim 6, Falcón, as modified by Deisher and Pillai, discloses that “a determination is made as to whether the enable bit indicates that the sign of the result of the subtraction operation has a first value (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative [first value = value that corresponds to a positive input] – Pillai, paragraph 62; see also paragraphs 31-32 (disclosing that the activation function operates on the result of multiply and accumulate operations, i.e., the subtraction operation)).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Deisher to determine the value of the sign bit indicating the result of the accumulation operations, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Regarding claim 7, Falcón, as modified by Deisher and Pillai, discloses that “the multiplier performs a second multiplication operation of the input data value and a second interval version of the first parameter value during a third processing cycle of a PWL approximation upon a determination that the enable bit indicates that the [sign] has the first value (an activation function unit may contain a piecewise approximation arithmetic unit that includes a sign block in which the sign of the input is used to select an offset parameter and to adjust the final sign of the output – Falcón, paragraph 132; after four PU cycles, the PUs have both the inputs [input data values] and weights [parameter values] required for the computation so they can perform the dot product [multiplication] operation – Falcón, paragraph 62 [note that, since the output of the multiplication and accumulation is fed into the activation function unit, which performs the PWL approximation (see paragraph 114), this step can be regarded as part of the PWL approximation]; see also Fig. 8 and paragraph 44 (showing a fully connected neural network with a plurality of layers, in which the output of the activation function of one layer is used as input to a neuron in the next layer, which then proceeds to perform a second multiplication between a second version of the weight [parameter value] and the input); note also that these second-layer calculations are not performed until the first-layer activation function is computed, including the determination of the sign bit), and the accumulator performs a second subtraction operation of a second interval version of the second parameter from a result of the second multiplication operation (after four PU cycles, the PUs have both the inputs and the weights required for the computation, so they can perform the dot product operation [in the first stage] and accumulate the result [in the second stage] with the previous computation [i.e., the dot product of the previous weights [second parameter] and the previous inputs] if the neuron has many inputs – Falcón, paragraph 62 [note that accumulation is adding the current result to the previous results, which is equivalent mathematically to subtracting the negative of the previous results]; see also Fig. 8 and paragraph 44 (showing a fully connected neural network with a plurality of layers, in which the output of the activation function of one layer is used as input to a neuron in the next layer, which then proceeds to perform a second accumulation between the result of the second multiplication and the product of a second version of the other weights [parameter value] and the other inputs to the neuron)).” 

Regarding claim 8, Falcón, as modified by Deisher and Pillai, discloses that “a result of the addition operation is saved as the [result] upon a determination that the sign has a second value (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative [second value = value that corresponds to a negative input] – Pillai, paragraph 62 [note that the result of the addition operation must be saved because it is later used for determining whether to output the input or 0]).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Deisher to save the value of the result upon determining the value of its sign, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Regarding claim 11, Falcón, as modified by Pillai and Deisher, discloses “performing a second processing cycle, including: 
performing an addition operation of a first interval version of a third parameter and the result of the subtraction operation during a second processing cycle of the PWL approximation (in an activation function unit of a neural network accelerator, after a multiply and accumulate step, the sum value is passed to another accumulator where a bias [third parameter] may be added to the neural network sum output [result of the subtraction operation]; the sum buffer may hold multiple sums, allowing the accumulation to generate an output every cycle despite the fact that the activation function may take several cycles to be performed – Deisher, paragraph 94 [i.e., the multiplication and accumulation may take place in one cycle and the addition of the bias term may take place in a second cycle]; see also paragraph 84 (disclosing that the MAC may require multiple clock cycles to receive an entire input vector over multiple input sets), Fig. 5; compare specification paragraph 213 (disclosing that the second-step addition is of bias term ci)).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Pillai to add a third parameter to the result of the multiply-and-accumulate operations in a separate cycle, as disclosed by Deisher, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would add more flexibility to the system by not restricting the output to a product of weights and inputs and allowing an arbitrary constant to be added to the result.  See Deisher, paragraph 94.
 
Regarding claim 12, Falcón, as modified by Deisher and Pillai, discloses “determining whether the enable bit indicates that the sign of the result of the subtraction operation has a first value (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative [first value = value that corresponds to a positive input] – Pillai, paragraph 62; see also paragraphs 31-32 (disclosing that the activation function operates on the result of multiply and accumulate operations, i.e., the subtraction operation)).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Deisher to determine the value of the sign bit indicating the result of the accumulation operations, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Regarding claim 13, Falcón, as modified by Pillai and Deisher, discloses “performing a third processing cycle, including: 
performing a second multiplication operation of the input data value and a second interval version of the first parameter value upon a determination that the enable bit indicates that the [sign] has the first value (an activation function unit may contain a piecewise approximation arithmetic unit that includes a sign block in which the sign of the input is used to select an offset parameter and to adjust the final sign of the output – Falcón, paragraph 132; after four PU cycles, the PUs have both the inputs [input data values] and weights [parameter values] required for the computation so they can perform the dot product [multiplication] operation – Falcón, paragraph 62 [note that, since the output of the multiplication and accumulation is fed into the activation function unit, which performs the PWL approximation (see paragraph 114), this step can be regarded as part of the PWL approximation]; see also Fig. 8 and paragraph 44 (showing a fully connected neural network with a plurality of layers, in which the output of the activation function of one layer is used as input to a neuron in the next layer, which then proceeds to perform a second multiplication between a second version of the weight [parameter value] and the input); note also that these second-layer calculations are not performed until the first-layer activation function is computed, including the determination of the sign bit); and 
performing a second subtraction operation of a second interval version of the second parameter from the result of the second multiplication operation (after four PU cycles, the PUs have both the inputs and the weights required for the computation, so they can perform the dot product operation [in the first stage] and accumulate the result [in the second stage] with the previous computation [i.e., the dot product of the previous weights [second parameter] and the previous inputs] if the neuron has many inputs – Falcón, paragraph 62 [note that accumulation is adding the current result to the previous results, which is equivalent mathematically to subtracting the negative of the previous results]; see also Fig. 8 and paragraph 44 (showing a fully connected neural network with a plurality of layers, in which the output of the activation function of one layer is used as input to a neuron in the next layer, which then proceeds to perform a second accumulation between the result of the second multiplication and the product of a second version of the other weights [parameter value] and the other inputs to the neuron)).”

Regarding claim 14, Falcón, as modified by Deisher and Pillai, discloses “saving a result of the addition operation upon a determination that the sign has a second value (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative [second value = value that corresponds to a negative input] – Pillai, paragraph 62 [note that the result of the addition operation must be saved because it is later used for determining whether to output the input or 0]).”   It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Deisher to save the value of the result upon determining the value of its sign, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Regarding claim 18, Falcón, as modified by Pillai and Deisher, discloses that “the accumulator performs an addition operation of a first interval version of a third parameter and the result of the subtraction operation during a second processing cycle of the PWL approximation (in an activation function unit of a neural network accelerator, after a multiply and accumulate step, the sum value is passed to another accumulator where a bias [third parameter] may be added to the neural network sum output [result of the subtraction operation]; the sum buffer may hold multiple sums, allowing the accumulation to generate an output every cycle despite the fact that the activation function may take several cycles to be performed – Deisher, paragraph 94 [i.e., the multiplication and accumulation may take place in one cycle and the addition of the bias term may take place in a second cycle]; see also paragraph 84 (disclosing that the MAC may require multiple clock cycles to receive an entire input vector over multiple input sets), Fig. 5; compare specification paragraph 213 (disclosing that the second-step addition is of bias term ci)).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Pillai to add a third parameter to the result of the multiply-and-accumulate operations in a separate cycle, as disclosed by Deisher, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would add more flexibility to the system by not restricting the output to a product of weights and inputs and allowing an arbitrary constant to be added to the result.  See Deisher, paragraph 94.
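
For illustration only, the two-step sequence relied upon above (multiply and accumulate in one cycle, followed by addition of a bias term in a second cycle) may be sketched as follows. The sketch is not taken from Deisher: the function names mac_cycle, bias_cycle, and neuron_pre_activation are hypothetical, and modeling each cycle as a separate function call is an assumption made for clarity.

```python
from typing import Sequence


def mac_cycle(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """First step (assumed single cycle): multiply inputs by weights and accumulate."""
    return sum(x * w for x, w in zip(inputs, weights))


def bias_cycle(acc_sum: float, bias: float) -> float:
    """Second step (assumed separate cycle): add the bias [third parameter] to the sum."""
    return acc_sum + bias


def neuron_pre_activation(inputs: Sequence[float],
                          weights: Sequence[float],
                          bias: float) -> float:
    """Chain the two steps; the result would feed the activation function unit."""
    acc_sum = mac_cycle(inputs, weights)
    return bias_cycle(acc_sum, bias)
```

Under these assumptions, inputs (1, 2), weights (3, 4), and a bias of 5 give an accumulated sum of 11 after the first step and 16 after the second step.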

Regarding claim 19, Falcón, as modified by Deisher and Pillai, discloses that “a determination is made as to whether the enable bit indicates that the sign of the result of the subtraction operation has a first value (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative [first value = value that corresponds to a positive input] – Pillai, paragraph 62; see also paragraphs 31-32 (disclosing that the activation function operates on the result of multiply and accumulate operations, i.e., the subtraction operation)).”   It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Deisher to determine the value of the sign bit indicating the result of the accumulation operations, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Regarding claim 20, Falcón, as modified by Deisher and Pillai, discloses that “the multiplier performs a second multiplication operation of the input data value and a second interval version of the first parameter value during a third processing cycle of a PWL approximation upon a determination that the enable bit indicates that the [sign] has the first value (an activation function unit may contain a piecewise approximation arithmetic unit that includes a sign block in which the sign of the input is used to select an offset parameter and to adjust the final sign of the output – Falcón, paragraph 132; after four PU cycles, the PUs have both the inputs [input data values] and weights [parameter values] required for the computation so they can perform the dot product [multiplication] operation – Falcón, paragraph 62 [note that, since the output of the multiplication and accumulation is fed into the activation function unit, which performs the PWL approximation (see paragraph 114), this step can be regarded as part of the PWL approximation]; see also Fig. 8 and paragraph 44 (showing a fully connected neural network with a plurality of layers, in which the output of the activation function of one layer is used as input to a neuron in the next layer, which then proceeds to perform a second multiplication between a second version of the weight [parameter value] and the input); note also that these second-layer calculations are not performed until the first-layer activation function is computed, including the determination of the sign bit), and the accumulator performs a second subtraction operation of a second interval version of the second parameter from a result of the second multiplication operation (after four PU cycles, the PUs have both the inputs and the weights required for the computation, so they can perform the dot product operation [in the first stage] and accumulate the result [in the second stage] with the previous computation [i.e., the dot product of the previous weights [second parameter] and the previous inputs] if the neuron has many inputs – Falcón, paragraph 62 [note that accumulation is adding the current result to the previous results, which is equivalent mathematically to subtracting the negative of the previous results]; see also Fig. 8 and paragraph 44 (showing a fully connected neural network with a plurality of layers, in which the output of the activation function of one layer is used as input to a neuron in the next layer, which then proceeds to perform a second accumulation between the result of the second multiplication and the product of a second version of the other weights [parameter value] and the other inputs to the neuron)).”
 
Regarding claim 21, Falcón, as modified by Deisher and Pillai, discloses that “a result of the addition operation is saved as the [result] upon a determination that the sign has a second value (when an activation function circuit is configured to implement a ReLU function, the ReLU opcode bit is set to 1 and the remaining opcode bits are set to 0; the selection signal of one multiplexer is based on the sign bit of the input, so that the mux selects either the input or 0 as the output depending on whether the input is positive or negative [second value = value that corresponds to a negative input] – Pillai, paragraph 62 [note that the result of the addition operation must be saved because it is later used for determining whether to output the input or 0]).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Falcón and Deisher to save the value of the result upon determining the value of its sign, as disclosed by Pillai, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would assist the system in determining what the result of the activation function should be, since the value of the activation function ultimately depends on the sign of the input.  See Pillai, paragraph 62 (disclosing that the output of the ReLU function is 0 when the sign of the input is negative and is equal to the original input when the sign of the original input is positive).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7:50a-5:50p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.C.V./             Examiner, Art Unit 2125                                                                                                                                                                                           
/BRIAN M SMITH/             Primary Examiner, Art Unit 2122                                                                                                                                                                                           



    
        1 The specification does not define the term “interval version,” and Examiner cannot find any evidence that it was an accepted term of art before the effective filing date.  The closest the specification comes to elucidating the term is at paragraph 214, which discloses that a range over which input data are received is divided into non-uniform intervals.  Thus, for purposes of examination, an “interval version” will be construed to mean any instantiation of a number or range of numbers.