Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
EXAMINER’S AMENDMENT
Authorization for this examiner’s amendment was given in a telephonic conversation with Attorney Michael J Lenisa on 4/21/2022.
The application has been amended as follows: 
Claim 1. (Currently Amended) An apparatus for training artificial intelligence (AI) models, comprising:
an input interface to receive in real time model training data from one or more sources to train one or more artificial neural networks (ANNs) associated with the one or more sources, each of the one or more sources associated with at least one of the ANNs; 
a load distributor coupled to the input interface to distribute in real time the model training data for the one or more ANNs to one or more AI appliances; 
a resource manager coupled to the load distributor to dynamically assign one or more computing resources on one of the AI appliances to each of the ANNs in view of amounts of the training data received in real time from the one or more sources for their associated ANNs; and 
a traffic predictor to decide whether to increase or decrease one or more computing resources assigned to at least one of the ANNs based on an artificial intelligence based model that predicts future traffic volume for the at least one of the ANNs, wherein the resource manager is to scale up or scale down the computing resources assigned to the at least one of the ANNs in response to an instruction received from the traffic predictor.

Claim 11. (Cancelled) 

Claim 13. (Currently Amended) The apparatus of claim 1, further comprising an ANN accumulator, coupled to the resource manager, to periodically collect versions of an ANN as trained on each of the computing resources assigned to the ANN, combine them into [[an]] a composite version of the ANN, and store the composite version.

Claim 16. (Currently Amended) One or more non-transitory computer-readable storage media comprising a plurality of instructions that in response to being executed cause a network gateway device, to:
receive in real time model training data from one or more sources to train one or more artificial neural networks (ANNs) associated with the one or more sources, each of the one or more sources associated with at least one of the ANNs; 
distribute in real time the model training data for the one or more ANNs to one or more AI appliances; 
dynamically assign one or more computing resources on one of the AI appliances to each of the ANNs in view of amounts of the training data received in real time from the one or more sources for their associated ANNs; [[and]]
decide whether to increase or decrease one or more computing resources assigned to at least one of the ANNs based on an artificial intelligence based model that predicts future traffic volume for the at least one of the ANNs; and
scale up or scale down the computing resources assigned to the at least one of the ANNs in response to the determination of whether to increase or decrease the computing resources assigned to the at least one of the ANNs.

Claim 19. (Cancelled) 

Claim 22. (Currently Amended) A method of managing artificial intelligence (AI) model training data on a network gateway device, comprising:
receiving in real time model training data from one or more sources to train one or more artificial neural networks (ANNs) associated with the one or more sources, each of the one or more sources associated with at least one of the ANNs; 
distributing in real time the model training data for the one or more ANNs to one or more AI appliances; 
dynamically assigning one or more computing resources on one of the AI appliances to each of the ANNs in view of amounts of the training data received in real time from the one or more sources for their associated ANNs; [[and]]
deciding whether to increase or decrease one or more computing resources assigned to at least one of the ANNs based on an artificial intelligence based model that predicts future traffic volume for the at least one of the ANNs; and
scaling up or scaling down the compute resources assigned to the at least one of the ANNs based in response to the decision to increase or decrease of the computing resources assigned to the at least one of the ANNs.

Claim 23. (Currently Amended) The method of claim 22, further comprising:
monitoring traffic volume from all of the sources associated with an ANN; and deciding whether to increase or decrease the computing resources assigned to the ANN, based, at least in part, on the aggregated traffic volume of the ANN.

Examiner’s Statement of Reasons for Allowance
The following is an examiner’s statement of reasons for allowance: 
Independent claims 1, 16, and 22 are deemed to be allowable over the prior art, as neither Sundararaman et al. (US 10671916 B1), hereinafter Sun, nor Tan (US 20100169253 A1), nor Tseng et al. (US 20200302292 A1), nor Lie et al. (US 20190332926 A1), nor Cao et al. (US 20190130266 A1), nor Bigus (USPN 5704012), nor any combination thereof, teaches or suggests the limitations of the independent claims.
More importantly, none of the above-recited prior art teaches or suggests “a traffic predictor to decide whether to increase or decrease one or more computing resources assigned to at least one of the ANNs based on an artificial intelligence based model that predicts future traffic volume for the at least one of the ANNs” in combination with the other limitations recited in the independent claims.
The closest prior art is Sun. Sun teaches an apparatus for training artificial intelligence (AI) models, comprising: (Col 1 Line 53-Col 2 Line 12; system for training mathematical models; Col 30 Lines 25-33; neural network models; Col 35 Lines 63-67; deep learning and neural network models) an input interface (computing platform that can send and receive data) to receive in real time model training data (multiple data streams for training) from one or more sources (Col 5 Lines 58-67; training data stream sources within the computing system) to train one or more artificial neural networks (ANNs) (examiner notes by definition neural networks are artificial) associated with the one or more sources, (Fig 20A; Col 2 Lines 14-59; receiving multiple data streams from multiple sources, the data streams consisting of real time data for training the plurality of neural network models) each of the one or more sources associated with at least one of the ANNs; (Col 68 Lines 45-55; training data sources for a neural network model; Col 5 Lines 58-67; the one or more data sources being part of the computing platform)
a load distributor (computing platform) coupled to the input interface (computing platform sending and receiving, therefore the hardware must exist to be able to communicate in a network) to distribute (distributing training data sets) in real time the model training data (Col 66 Lines 5-40; the computing platform provides real time training data incoming from the streams to computing elements associated with each machine learning process) for the one or more ANNs (Col 68 Lines 45-55; where each mathematical model is a neural network model) to one or more AI appliances (computing elements associated with the specific machine learning process for real time training of the mathematical models); (Col 66 Lines 5-40; the computing platform provides real time training data incoming from the streams to computing elements associated with each machine learning process) and
a resource manager (computing platform) coupled to the load distributor (see mapping above) to dynamically assign one or more computing resources (computing elements) on one of the AI appliances (adding a computing element to the group of computing elements training a specific model) to each of the ANNs (Col 68 Lines 45-55; where each mathematical model is a neural network model) (Col 1 Line 52 -> Col 2 Line 12; allocating computing elements to a specific machine learning process and reallocating computing elements (equivalent to dynamically assigning) during peak times; the examiner refers to the previous mapping to show that the machine learning process is for a mathematical model, and that mathematical model is a neural network model and therefore equivalent to the ANN) in view of amounts of the training data received in real time from the one or more sources for their associated ANNs (Col 2 Lines 8-17 & 45-55; Col 1 Line 52 -> Col 2 Line 12; Col 66 Lines 5-40; Col 68 Lines 45-55; real time reallocation of computing resources to the training model that is lagging or has experienced extra load through the streams coming in for training the model)
the resource manager (computing platform) to, prior to a scale down of computing resources (Col 24 Lines 58-65; additional processors; Col 50 Lines 43-51; adding/placement (by definition is scaling up) and removing compute elements (by definition is scaling down) to the cluster of compute elements that service the algorithm; furthermore it is done on a computational basis and therefore can be added, functionality is run, and then it is removed, equivalent to prior to scale down) that are assigned to an ANN, (that are used in the algorithms for the neural network) collect a current version of the ANN from each of the computing resources to be removed and send the current versions to the ANN accumulator. (Col 30 Line 40 - Col 31 Line 67; updating the mathematical model after every training and merging them together (equivalent to ANN accumulator); Col 64 Line 36 - Col 65 Line 67; Fig 19A-19B; maintaining by the computing platform state and versions of the algorithms, and when re-allocating computing resources the memory is transferred from one computing element to another; teaches the updating of versions based on aggregating models, PRIOR TO REALLOCATION OF COMPUTE RESOURCE while accounting and managing and migrating resources (scaling up and down))
[Attorney Docket No.: 127075-235439 (AA3299-US); Date of Transmission: December 28, 2017]
wherein the computing resources include one or more processors or portions of processors disposed on the one or more AI appliances (Col 2 Lines 38-59; reallocating computing elements to machine learning training process (equivalent to AI appliance); Col 24 Lines 58-65; Fig 20B; the resources include additional processor of the system)
wherein the computing resources (computing elements) assigned to an ANN (Col 35 Lines 63-67; Col 47 Lines 1-5; neural network model) include multiple processors (Col 24 Lines 58-65; Fig 20B; the resources include additional processor of the system) and training of the ANN is partitioned between the multiple processors (Col 2 Lines 13-59; reallocating computing elements to process for training machine learning algorithms for the neural network model)
wherein the one or more sources of model training data (data gathering for training) include Internet of Things (IoT) devices (IoT setup) or edge clients (Col 5 Lines 58-67; Col 57 Lines 11-17; the data stream for training can come from a source within the computing platform; Col 54 Lines 16-53, where the computing platform can have an IoT setup; therefore if the platform is an IoT platform, and the stream for training is coming from devices within the platform, then it is equivalent to IoT devices)
wherein the resource manager (computing platform) prior to a scale up of computing resources (using additional resources) to be assigned to an ANN, (neural network) is further to identify additional available computing resources and register them with the load distributor (computing platform) (Col 43 Line 35 -> Col 44 Line 23; the system determines the additional computing elements that are available within the computing platform (equivalent to registered) and if they are registered prior to the execution of the algorithms then it was performed prior to scaling)
further comprising an ANN accumulator, (computing platform) coupled to the resource manager, (computing platform) to periodically collect versions of an ANN as trained on each of the computing resources assigned to the ANN, (Fig 64 Line 25 -> Col 65 Line 13; keeping track of the state and version of the algorithms used in neural networks as they are trained using the data sets collected in the stream) combine them into a composite version of the ANN, and store the composite version, (Fig 64 Line 25 -> Col 65 Line 13; merging the mathematical models into a single updated mathematical model for neural networks)
wherein the one or more AI appliances are provided in a chassis, (Col 38 Lines 55-61; computing elements executing the machine learning algorithms on hardware which is enclosed in a casing which is equivalent to a chassis) and coupled to a hardware fabric interface (HFI), (Col 8 Lines 5-10; network interface case; Col 38 Lines 55-61; a switching network which includes switching fabric) to provide an interface to tunnel communications between each AI appliance and the load distributor (Col 24 Lines 30-45; distributing the data sets between the computing elements).
Sun is different in that Sun teaches adjusting resources based on incoming traffic (Col 35 Lines 63-67; Col 47 Lines 1-5; neural network model) (Col 69 Lines 1-18; adjusting computing resources to help handle the peak demand based on the amount of data incoming from the stream in real time; see the mapping above for claim 1 showing the stream being the training data). Sun is silent on predicting future traffic for an ANN in training, and on adjusting resources based on the predicted traffic. Therefore, Sun does not disclose “a traffic predictor to decide whether to increase or decrease one or more computing resources assigned to at least one of the ANNs based on an artificial intelligence based model that predicts future traffic volume for the at least one of the ANNs”.
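For illustration only, the distinction drawn above can be sketched in Python. This sketch is not taken from Sun or from the application: the function names, the per-resource capacity constant, and the naive last-step extrapolation standing in for an "artificial intelligence based model" are all hypothetical assumptions. It contrasts sizing an allocation from the traffic currently observed (reactive) with sizing it from a forecast of future traffic (predictive).

```python
import math
from typing import Sequence

# Hypothetical assumption: volume of training data one computing resource can absorb.
CAPACITY_PER_RESOURCE = 100.0


def resources_needed(volume: float) -> int:
    """Computing resources required for a given traffic volume (never fewer than one)."""
    return max(1, math.ceil(volume / CAPACITY_PER_RESOURCE))


def reactive_target(history: Sequence[float]) -> int:
    """Reactive scaling: size the allocation for the most recently observed volume."""
    return resources_needed(history[-1])


def forecast_next(history: Sequence[float]) -> float:
    """Hypothetical stand-in for a learned predictor: extrapolate the last observed change."""
    if len(history) < 2:
        return history[-1]
    return max(0.0, history[-1] + (history[-1] - history[-2]))


def predictive_target(history: Sequence[float]) -> int:
    """Predictive scaling: size the allocation for the forecast future volume."""
    return resources_needed(forecast_next(history))


if __name__ == "__main__":
    observed = [120.0, 200.0, 280.0]  # hypothetical traffic volumes for one ANN
    print("reactive:", reactive_target(observed))
    print("predictive:", predictive_target(observed))
```

With rising traffic, the predictive target exceeds the reactive one because it provisions for the volume expected next rather than the volume already received.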
The second closest prior art is Tan. Tan teaches [0016] In accordance with one or more such embodiments, there is a system for: training a first artificial neural network (ANN); using the first ANN to predict a workload (i.e., the predicted workload) for a particular one of a number of hosts; and sending an indication to at least one of the hosts to migrate at least one of a number of computing tasks away from the particular host. In accordance with one or more such embodiments, training data for the ANN is based on a distribution over time of a computing workload for the particular host. In accordance with one or more such embodiments, the indication is sent when the system is operating in a proactive mode and when the predicted workload is outside of a proactive operating range for the particular host. [0017] Some embodiments include: monitoring the computing workload for the particular host; and automatically switching to the proactive mode when a difference between the monitored workload and the predicted workload is less than an autostart accuracy threshold. Other embodiments include monitoring the computing workload for the particular host; and automatically switching to a reactive mode when the monitored computing workload is outside of a failsafe operating range for the particular host. When in the reactive mode, the migration indication is sent based on the monitored workload.
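The mode switching described in the quoted Tan paragraphs can be sketched roughly as follows. This is an illustrative reading only, not Tan's implementation: the function names, the representation of operating ranges as (low, high) tuples, and the choice to combine the two described embodiments with the failsafe check taking precedence are all hypothetical assumptions.

```python
from typing import Tuple

Range = Tuple[float, float]  # hypothetical representation of an operating range: (low, high)


def in_range(value: float, rng: Range) -> bool:
    """True when value lies within the inclusive range."""
    return rng[0] <= value <= rng[1]


def next_mode(mode: str, monitored: float, predicted: float,
              autostart_accuracy_threshold: float, failsafe_range: Range) -> str:
    """Return the operating mode ("proactive" or "reactive") after one observation."""
    # Assumption: the failsafe check is evaluated first.
    if not in_range(monitored, failsafe_range):
        return "reactive"
    # Switch to proactive mode once the prediction tracks the monitored workload closely.
    if abs(monitored - predicted) < autostart_accuracy_threshold:
        return "proactive"
    return mode


def send_migration_indication(mode: str, predicted: float, proactive_range: Range) -> bool:
    """In proactive mode, signal migration when the predicted workload leaves the proactive range."""
    return mode == "proactive" and not in_range(predicted, proactive_range)


if __name__ == "__main__":
    mode = next_mode("reactive", monitored=0.50, predicted=0.52,
                     autostart_accuracy_threshold=0.05, failsafe_range=(0.0, 0.9))
    print(mode, send_migration_indication(mode, predicted=0.85, proactive_range=(0.2, 0.8)))
```

Note that the prediction here concerns the workload of a host, which is the point on which the examiner distinguishes Tan from the claimed prediction of traffic for an ANN in training.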
Tan is different in that Tan uses an ANN to predict workload but does not predict workload for an ANN, and therefore does not disclose “a traffic predictor to decide whether to increase or decrease one or more computing resources assigned to at least one of the ANNs based on an artificial intelligence based model that predicts future traffic volume for the at least one of the ANNs”.
Tseng teaches [0095] Subsequently, in operation S4-6, the apparatus 200 determines whether each of the trained neural networks satisfies the one or more imposed constraints (e.g. those received/determined in operation S4-1), e.g. comparing the monitored computational resource use with the one or more imposed constraints. For instance, the apparatus 200 may determine whether the accuracy of the trained neural network satisfies the minimum acceptable accuracy constraint and/or whether the computational resource usage of the trained neural network satisfies the computational resource constraint.
Lie teaches [0591] Unlike gradient-descent techniques (e.g., SGD and MBGD) that use a full forward pass and a full backward pass through a network to compute a gradient estimate, and thus result in a sequential dependency, CPGD uses a differential construction to replace the sequential dependency with a continuous model that has sustained gradient generation. In some embodiments and/or usage scenarios, CPGD enables layer parallelism by enabling each layer of a neural network to be trained (e.g., to ‘learn’) concurrently with others of the layers without explicit synchronization. Thus, parallelization along the depth of a neural network enables applying more computing resources to training. In various embodiments and/or usage scenarios, CPGD provides comparable accuracy and improved convergence rate expressed in epochs of training compared to other techniques.
Cao teaches [0059] With the Regularizer 112, GAN 100 may be trained in a faster and more robust manner, leading to reduced computational complexity and increased efficiency in allocating computing resources during training of a neural network. This may be achieved by better allocating and directing D's finite parameter resources on real and generated data.
Bigus teaches: In a system comprising a plurality of resources for performing useful work, a resource allocation controller function, which is customized to the particular system's available resources and configuration, dynamically allocates resources and/or alters configuration to accommodate a changing workload. Preferably, the resource allocation controller is part of the computer's operating system which allocates resources of the computer system. The resource allocation controller uses a controller neural network for control, and a separate system model neural network for modelling the system and training the controller neural network. Performance data is collected by the system and used to train the system model neural network. A system administrator specifies computer system performance targets which indicate the desired performance of the system. Deviations in actual performance from desired performance are propagated back through the system model and ultimately to the controller neural network to create a closed loop system for resource allocation.
Tseng, Lie, Cao, and Bigus are different in that they disclose adjusting resources for a neural network in training, but do not disclose that the adjustment is performed based on future traffic prediction, and therefore do not disclose “a traffic predictor to decide whether to increase or decrease one or more computing resources assigned to at least one of the ANNs based on an artificial intelligence based model that predicts future traffic volume for the at least one of the ANNs”.

In the examiner’s opinion, it would not have been obvious to one of ordinary skill in the art prior to the effective filing date of the application to modify the teachings of the above-recited prior art to teach predicting future traffic for an artificial neural network being trained using real time network traffic as recited in the claims.
Therefore, independent claims 1, 16, and 22 are deemed to be allowable over the prior art, and dependent claims 2-10, 12-13, 15, 17-18, 20, and 23-27 are deemed to be allowable in light of their dependency from an allowable claim.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDERRAHMEN H CHOUAT whose telephone number is (571)431-0695. The examiner can normally be reached 9AM-5PM Tentative.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Christopher Parry, can be reached at 571-272-8328. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Abderrahmen Chouat
Examiner
Art Unit 2451



/Chris Parry/
Supervisory Patent Examiner, Art Unit 2451