DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This Office Action is in response to applicant’s communication filed 30 June 2022, in response to the Office Action mailed 31 March 2022.  The applicant’s remarks and any amendments to the claims or specification have been considered, with the results that follow.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 5-9, 11-15, 17-21, and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Chen (US 5,226,092) in view of Dushkoff et al. (Adaptive Activation Functions for Deep Networks, Feb. 2016, pgs. 1-5), and further in view of Gomolka (Backpropagation algorithm with fractional derivatives, Oct. 2018, pgs. 1-10).

As per claim 1, Chen teaches a computing system comprising: a network controller [the system may include a processor connected to multiple other devices via I/O and including input processing (figs. 5, 6A, etc.); where, besides input processing, connection to the external devices in a network inherently requires control]; a processor coupled to the network controller [the system may include a processor connected to multiple other devices via I/O (figs. 5, 6A, etc.)]; and a memory coupled to the processor, the memory including a set of instructions [the system includes a processor connected to a memory containing the modules to implement the modules (figs. 5, 6A, etc.); which is a set of instructions on what operations to perform], which when executed by the processor, cause the computing system to: adjust a plurality of weights in a neural network model [weights of a neural network are adjusted during training (col. 7, line 57 to col. 8, line 24; etc.)], generate partial derivatives of a plurality of activation functions in the neural network model [the error value for a unit is determined using the partial derivative of the error with respect to a change in the output of the unit using the activation function of the unit (col. 13, lines 21-37; etc.) where the units are the neurons of the neural network (col. 1, lines 55-60; etc.)], generate a plurality of difference values based on the partial derivatives [the error value for a unit is determined using the partial derivative of the error with respect to a change in (difference value) the output of the unit using the activation function of the unit (col. 13, lines 21-37; etc.)], control the activation function to adjust, via the plurality of difference values on a per neuron basis, the plurality of neurons of activation functions in the neural network model [after computing error/delta values for the output units, the error terms are propagated back to all of the units (neurons) in the network to compute a delta value for each unit to modify each unit weighting (col. 13, lines 38-67; etc.) where the activation function is a logistic function of the inputs and weights of the units connected to the current unit (col. 9, lines 30-50; col. 16, lines 5-62; etc.); so modifying the weights of the neurons changes the plurality of activation functions for each neuron], and output the neural network model in response to one or more conditions being satisfied by the plurality of weights and the plurality of activation functions [training is performed to find a minimum error value (col. 14, lines 14-68; etc.)].
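For illustration only, the per-unit error/delta computation characterized above can be sketched as follows. This is a minimal sketch of textbook backpropagation for logistic units under a squared-error assumption; it is not code from Chen, and the function names, the squared-error choice, and the learning rate are hypothetical.

```python
import math


def logistic(x):
    """Logistic activation of a unit's weighted input sum."""
    return 1.0 / (1.0 + math.exp(-x))


def output_delta(output, target):
    """Delta for an output unit, assuming E = 0.5 * (target - output)**2.

    dE/d(output) = (output - target); the logistic derivative with
    respect to the unit's net input is output * (1 - output).
    """
    return (output - target) * output * (1.0 - output)


def hidden_delta(output, downstream_weights, downstream_deltas):
    """Delta for a hidden unit: back-propagated error times the
    logistic derivative of this unit."""
    back = sum(w * d for w, d in zip(downstream_weights, downstream_deltas))
    return back * output * (1.0 - output)


def updated_weight(weight, delta, unit_input, learning_rate=0.5):
    """One gradient-descent step on a single connection weight
    (learning_rate is an assumed value)."""
    return weight - learning_rate * delta * unit_input
```

In this sketch each unit receives its own delta, so the weight feeding each unit is modified on a per-unit basis.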
While Chen teaches generating the partial derivatives of the neuron using the activation function of each neuron, and calculating delta values for each neuron to make modifications to the neuron (see above), it does not explicitly teach generating partial derivatives of a plurality of activation functions in the neural network model with respect to fractional derivative values, wherein the fractional derivative values are part of the plurality of activation functions and control whether the plurality of activation functions morph during training, and controlling the fractional derivative values to adjust, via the plurality of difference values on a per neuron basis, the plurality of activation functions in the neural network model.
Dushkoff teaches adjusting via a plurality of difference values on a per neuron basis, a plurality of activation functions in the neural network model to morph the plurality of activation functions [each node in the network may use adaptive activation functions which are optimized to find the best activation function for each by gating certain activation functions, where the selection of the functions is treated as an optimization parameter during the gradient descent (i.e., vs target output) (pg. 2, Adaptive Activation Functions)].
Chen and Dushkoff are analogous art, as they are within the same field of endeavor, namely optimizing the activation functions of a neural network using gradient descent.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include adaptive activation functions, including optimizing the selected activation function for each neuron based on the output during training, as taught by Dushkoff, in the optimization of the neurons based on the error/delta values calculated for each neuron during training in the system taught by Chen.
Dushkoff provides motivation as [allowing individually adaptive activation functions improves the performance of the network (pg. 1, abstract; pg. 2, Adaptive Methods; etc.)].
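For illustration only, a per-node adaptive activation function whose selection is treated as an optimization parameter can be sketched as follows. This is a minimal sketch, not Dushkoff's implementation: the soft sigmoid gate, the three candidate functions, and all names are assumptions made for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


# Hypothetical per-node candidate set; the reference's actual set may differ.
CANDIDATES = (math.tanh, sigmoid, relu)


def gated_activation(x, gates):
    """Node output as a gated sum of candidate activations.

    Each gate parameter g_i is squashed to (0, 1), so gradient descent
    can open or close individual candidates for this node.
    """
    return sum(sigmoid(g) * f(x) for g, f in zip(gates, CANDIDATES))


def gate_gradients(x, gates, upstream):
    """Gradient of the loss with respect to each gate parameter.

    d(out)/d(g_i) = sigmoid(g_i) * (1 - sigmoid(g_i)) * f_i(x);
    `upstream` is d(loss)/d(out) for this node.
    """
    return [
        upstream * sigmoid(g) * (1.0 - sigmoid(g)) * f(x)
        for g, f in zip(gates, CANDIDATES)
    ]


def step_gates(gates, grads, learning_rate=0.1):
    """One gradient-descent update of this node's gate parameters."""
    return [g - learning_rate * gr for g, gr in zip(gates, grads)]
```

Because every node carries its own gate vector, two nodes trained on the same data can end up with different effective activation functions.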
Gomolka teaches generating activation functions in the neural network model with respect to fractional derivative values, wherein the fractional derivative values are part of the plurality of activation functions and control whether the plurality of activation functions morph during training [during training using gradient descent (pgs. 1-2, section 1) using a fractional derivative to model the individual neurons and minimize the error function to shape the activation functions of individual neurons (pg. 1, abstract; etc.); for morphing each activation function in Chen/Dushkoff, above], and controlling the fractional derivative values to adjust, via the plurality of difference values on a per neuron basis, the plurality of activation functions in the neural network model [during training using gradient descent (pgs. 1-2, section 1) using a fractional derivative to model the individual neurons' activation functions, and minimizing the error function to shape the activation functions of individual neurons (pg. 1, abstract; pg. 3, section 2; etc.); for adjusting the activation functions using the difference values in Chen/Dushkoff, above].
Chen/Dushkoff and Gomolka are analogous art, as they are within the same field of endeavor, namely optimizing the neural network using gradient descent.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize fractional derivatives for modeling the neurons and minimizing the error during training, as taught by Gomolka, in the neuron optimization for minimizing error during training in the system taught by Chen/Dushkoff.
Gomolka provides motivation as [the proposed algorithm allows the learning process to be conducted with a smooth modification of the shape of the transition function without the need for modifying the IT model of the designed neural network (pg. 1, abstract, etc.)].
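For illustration only, the way a fractional derivative order can smoothly change the shape of an activation function can be sketched as follows. This is a minimal sketch under stated assumptions, not Gomolka's implementation: it assumes a softplus base function and a truncated Grünwald-Letnikov approximation, and the function names, step size, and term count are hypothetical.

```python
import math


def softplus(x):
    """Assumed base function log(1 + e^x), computed stably."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def fractional_activation(x, nu, h=0.05, terms=1000):
    """Truncated Grunwald-Letnikov derivative of order nu of softplus.

    D^nu f(x) ~= h**(-nu) * sum_k c_k * f(x - k*h), where
    c_0 = 1 and c_k = c_{k-1} * (1 - (nu + 1) / k).
    nu = 0 returns softplus itself; nu = 1 reduces to a backward
    difference, which approximates the logistic sigmoid.
    """
    coeff = 1.0
    total = softplus(x)
    for k in range(1, terms + 1):
        coeff *= 1.0 - (nu + 1.0) / k
        total += coeff * softplus(x - k * h)
    return total * h ** (-nu)
```

In this sketch the order `nu` acts as a per-neuron shape parameter: intermediate values such as `fractional_activation(0.0, 0.5)` fall between the softplus and sigmoid values at the same input, so adjusting `nu` changes the function's shape without swapping in a different function.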

As per claim 2, Chen/Dushkoff/Gomolka teaches wherein two or more of the plurality of activation functions are to be different from one another [after computing error/delta values for the output units, the error terms are propagated back to all of the units (neurons) in the network to compute a delta value for each unit to modify each unit weighting (Chen: col. 13, lines 38-67; etc.) where the activation function is a logistic function of the inputs and weights of the units connected to the current unit (Chen: col. 9, lines 30-50; col. 16, lines 5-62; etc.); where each node in the network may use adaptive activation functions which are optimized to find the best activation function for each by gating certain activation functions, where the selection of the functions is treated as an optimization parameter during the gradient descent (Dushkoff: pg. 2, Adaptive Activation Functions)].

As per claim 3, Chen/Dushkoff/Gomolka teaches wherein to adjust the plurality of activation functions on the per neuron basis, the instructions, when executed, cause the computing system to: apply a first difference value to a first activation function of a first neuron; and apply a second difference value to a second activation function of a second neuron [after computing error/delta values for the output units, the error terms are propagated back to all of the units (neurons) in the network to compute a delta value for each unit to modify each unit weighting (Chen: col. 13, lines 38-67; etc.) where the activation function is a logistic function of the inputs and weights of the units connected to the current unit (Chen: col. 9, lines 30-50; col. 16, lines 5-62; etc.); where each node in the network may use adaptive activation functions which are optimized to find the best activation function for each by gating certain activation functions, where the selection of the functions is treated as an optimization parameter during the gradient descent (Dushkoff: pg. 2, Adaptive Activation Functions)].

As per claim 5, Chen/Dushkoff/Gomolka teaches wherein the instructions, when executed, cause the computing system to select the plurality of activation functions from a library of trainable activation functions [each neuron may select an activation function from a library (Dushkoff: pg. 1, abstract; etc.)].

As per claim 6, Chen/Dushkoff/Gomolka teaches wherein the one or more conditions include an accuracy condition [training is performed to find a minimum error value (Chen: col. 14, lines 14-68; etc.) using a fractional derivative to model the individual neurons and minimize the error function to shape the activation functions of individual neurons (Gomolka: pg. 1, abstract, etc.); where error is an accuracy condition (a lack of accuracy)].

As per claim 7, see the rejection of claim 1, above, wherein Chen/Dushkoff/Gomolka also teaches one or more substrates; and logic coupled to the one or more substrates, wherein the logic is implemented at least partly in one or more of configurable logic or fixed-functionality hardware logic, the logic to perform the steps [the system includes a processor connected to a memory containing the modules to implement the modules (Chen: figs. 5, 6A, etc.); where a processor performing operations inherently requires a substrate with implemented logic, which must be either configurable or fixed-function (as those are the only two options)].

As per claim 8, see the rejection of claim 2, above.

As per claim 9, see the rejection of claim 3, above.

As per claim 11, see the rejection of claim 5, above.

As per claim 12, see the rejection of claim 6, above.

As per claim 13, see the rejection of claim 1, above, wherein Chen/Dushkoff/Gomolka also teaches at least one computer readable storage medium comprising a set of instructions, which when executed by a computing system cause the computing system to perform the steps [the system includes a processor connected to a memory containing the modules to implement the modules (Chen: figs. 5, 6A, etc.); which is a set of instructions on what operations to perform].

As per claim 14, see the rejection of claim 2, above.

As per claim 15, see the rejection of claim 3, above.

As per claim 17, see the rejection of claim 5, above.

As per claim 18, see the rejection of claim 6, above.

As per claim 19, see the rejection of claim 1, above.

As per claim 20, see the rejection of claim 2, above.

As per claim 21, see the rejection of claim 3, above.

As per claim 23, see the rejection of claim 5, above.

As per claim 24, see the rejection of claim 6, above.


Response to Arguments
The rejections of claims 13-18 under 35 U.S.C. 101 have been withdrawn due to the amendments filed.

Applicant’s arguments with respect to claim(s) 1-3, 5-9, 11-15, 17-21, and 23-24 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i): claims 4, 10, 16, and 22 are cancelled; claims 1-3, 5-9, 11-15, 17-21, and 23-24 are rejected.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Sadowski (US 2021/0056403) – discloses iteratively updating weights of a neural network until the error falls below a threshold.
Toomarian (US 5,428,710) – discloses using partial derivatives of each neuron output and constructing an error vector of difference values.

The examiner requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.

When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections.  See 37 CFR 1.111(c).

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE GIROUX whose telephone number is (571)272-9769. The examiner can normally be reached M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/GEORGE GIROUX/Primary Examiner, Art Unit 2128