DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office correspondence is in response to “Amendment and Response under 37 C.F.R. 1.111” filed on June 22, 2022.
Claims 1 – 8 and 10 – 20 are pending.
Claims 1, 11, and 16 are amended.
Claim 9 is canceled.
Claims 1 – 8 and 10 – 20 are rejected.
Response to Arguments
Applicant’s arguments filed on 6/22/2022 have been fully considered, and at least one argument was persuasive in regard to the rejection of claims 1 – 8 and 10 – 20 under 35 U.S.C. 103; said rejections from the prior office action are withdrawn. However, applicant’s amendments precipitated a new search and consideration of the amended claims, and new grounds of rejection were found for claims 1 – 8 and 10 – 20 under 35 U.S.C. 103. The examiner now responds to each argument. Underlined text represents amendments to the claims made subsequent to the prior office action.
In regard to claims 1 – 6, 9, and 11 – 20, the applicant argues that the prior art combination of Chew and Ghibril fails to explicitly teach, suggest or disclose:
A) “receiving a machine learning model for the fog node from a cloud, wherein the cloud
provides the machine learning model with its initial training, the machine learning model being
used to optimize performance of at least the fog node at the fog layer;” (as recited in claim 1 and substantially replicated in claims 11 and 16).
The applicant states:
“ . . . Chew provides a method for an analytics engine hosted by a cloud based server to assist in training an analytics engine hosted by an edge server. This may improve the prediction accuracy of the analytics engine hosted by the edge server. See Chew, Abstract and paragraph [0021]. Moreover, paragraph [0041] of Chew describes that the analytics engine hosted by the cloud server 204 may be used to proactively train an analytics engine hosted by the edge server to increase the prediction accuracies. Chew explicitly describes a process for optimizing performance of a specific application (e.g., an analytics engine) at an edge device and not the performance of the edge device itself. In contrast, claim 1 recites, inter alia, “receiving a machine learning model for the fog node from a cloud, wherein the cloud provides the machine learning model with its initial training, the machine learning model being used to optimize performance of at least the fog node at the fog layer.” Chew fails to teach or suggest this feature. . . “) (Applicant’s remarks pages 7 -8)

A) In response to the applicant’s argument:
The applicant amended the claims under review to require that the machine learning model optimizes the performance of at least the fog node at the fog layer. This new requirement is not explicitly taught by the prior art combination of Chew and Ghibril. As was observed in the analysis of cancelled claim 9, Chew discloses the training of analytics engines to increase prediction accuracies within different edge devices but does not specifically call for optimizing the performance of at least the fog node at the fog layer, as the amended independent claims now require. Therefore, the applicant’s argument is persuasive and the rejections under 35 U.S.C. 103 over Chew and Ghibril are withdrawn. However, the applicant’s amendment required a new search and consideration to be performed, which resulted in a new ground of rejection under 35 USC 103 of the amended claims as being un-patentable over Chew (U.S. 2018/0285767 A1; herein referred to as Chew) in view of Ghibril et al. (U.S. 2019/0164087 A1; herein referred to as Ghibril) in further view of Satou (U.S. 2020/0101603 A1; herein referred to as Satou). The new prior art reference Satou is analogous art that, when combined with the previously cited art, teaches the machine learning model being used to optimize performance of at least the fog node at the fog layer. Specifically, Satou discloses a controller and a control system used for manipulating a robot device (see ¶ [0002]). The control system comprises a layer of fog computers in communication with edge devices used for the control of the robot (see Fig. 6, ¶¶ [0060-0062]). Further, the fog computer includes a machine learning device that learns optimal adjustments to operate the controller of the robot (see Fig. 9, ¶¶ [0073-0075]), and the operational performance of the entire control system including the fog computer is optimized using the knowledge of the learning models that were generated, and an optimized model is distributed to each controller of the system (see Fig. 10, ¶¶ [0076-0083]). The applicant is directed to the respective rejections described below.

B) “identifying an amount of spare resources that are available, wherein the amount of spare
resources corresponds to a difference between a current amount of resources being used that is
less than the amount of resources needed for normal operations of the fog node;
allocating the identified amount of spare resources for training the machine learning
model of the fog node;” (as recited in claim 1 and substantially replicated in claims 11 and 16)
The applicant states:
“ . . . On page 5 of the Office Action, the examiner correctly recognizes that Chew fails to teach or suggest these features. However, the examiner is relying on paragraphs [0043] and [0053] of Ghibril to teach the same.

At paragraph [0043], Ghibril describes a process of determining a negative impact or reduction in available resources at a network node, in response to which a predictive auto-scaling IE may determine to commission additional resources. In other words, Ghibril describes a process whereby an amount of additional resources needed is identified and not an amount of spare resources, as defined by claim 1.

Paragraph [0053] of Ghibril describes a process according to which a provisioning element may deploy the necessary resources and orchestrate their usage in the network. Because Ghibril does not describe a process of identifying spare resources, Ghibril also fails to teach or suggest allocating an identified amount of spare resources for training the machine learning model of a fog node, as required by claim 1.

For the foregoing reasons, a hypothetical combination of Chew and Ghibril fails to render obvious the features recited in claim 1. Claims 11 and 16 recite features that are somewhat similar to those recited in claim 1. Therefore, a hypothetical combination of Chew and Ghibril also fails to render obvious the features recited in claims 11 and 16 as well as claims 2-6, 12-15, and 17-20 that depend from one of claims 1, 11, and 16. Claim 9 has been canceled.

Accordingly, the undersigned representative respectfully requests reconsideration and withdrawal of the rejection of claims 1-6, 9, and 11-20 under 35 U.S.C. 103. . . . “ (Applicant’s Remarks page 8)

B) In response to the applicant’s argument:
The applicant’s argument that the “spare resources” recited in the claim are distinct from the “additional resources” disclosed in the prior art Ghibril is not persuasive because, under a broadest reasonable interpretation (BRI) analysis, the broad language of the limitation does not distinguish the process described in Ghibril from that of the instant application. It is established practice in patent prosecution that claims be given their broadest reasonable interpretation. See MPEP 2111 (“an examiner must construe claim terms in the broadest reasonable manner during prosecution as is reasonably allowed in an effort to establish a clear record of what applicant intends to claim. Thus, the Office does not interpret claims in the same manner as the courts. In re Morris, 127 F.3d 1048, 1054, 44 USPQ2d 1023, 1028 (Fed. Cir. 1997); In re Zletz, 893 F.2d 319, 321-22, 13 USPQ2d 1320, 1321-22 (Fed. Cir. 1989). Because applicant has the opportunity to amend the claims during prosecution, giving a claim its broadest reasonable interpretation will reduce the possibility that the claim, once issued, will be interpreted more broadly than is justified. In re Yamamoto, 740 F.2d 1569, 1571 (Fed. Cir. 1984); In re Zletz, 893 F.2d 319, 321, 13 USPQ2d 1320, 1322 (Fed. Cir. 1989) (“During patent examination the pending claims must be interpreted as broadly as their terms reasonably allow.”); In re Prater, 415 F.2d 1393, 1404-05, 162 USPQ 541, 550-51 (CCPA 1969)). Therefore, as discussed above, claims 1 – 6 and 11 – 20 are rejected under 35 USC 103 as being un-patentable over Chew (U.S. 2018/0285767 A1; herein referred to as Chew) in view of Ghibril et al. (U.S. 2019/0164087 A1; herein referred to as Ghibril) in further view of Satou (U.S. 2020/0101603 A1; herein referred to as Satou).
Further, as a result of the further search and consideration necessitated by the applicant’s amendments to claims 1, 11, and 16, new grounds of rejection were found for dependent claims 7, 8 and 10 under 35 USC 103 as being un-patentable over Chew (U.S. 2018/0285767 A1; herein referred to as Chew) in view of Ghibril et al. (U.S. 2019/0164087 A1; herein referred to as Ghibril) in further view of Satou (U.S. 2020/0101603 A1; herein referred to as Satou) in further view of Prakash et al. (U.S. 2019/0138934 A1; herein referred to as Prakash).  The applicant is directed to the respective rejections described below.
The examiner recommends that the applicant review the specification for disclosure that, if integrated into the independent claims, would distinguish the amended claims from the cited prior art. For example, the disclosure of the instant application discusses a process for sharing models among the fog nodes which may be distinct from the cited prior art (see instant application Figs. 3 – 4, ¶¶ [0039-0055]).
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 04/15/2022 was filed after the mailing date of the non-final office action on 03/22/2022. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner. The applicant is invited to contact the examiner for an interview to discuss how to move the prosecution forward.
Double Patenting Analysis
The applicant has co-pending application 16/298881, which has been identified as relevant to the instant application. At this time, the instant application appears to claim only subject matter directed to an invention that is independent and distinct from that claimed in the co-pending application, and names the inventor or at least one joint inventor named in the co-pending application. Therefore, no non-statutory Double Patenting rejection has been applied. The applicant is required to maintain a clear line of demarcation between the instant application’s claims and the co-pending application’s claims during prosecution, as the Double Patenting analysis can be revisited if the claims of the instant application and the co-pending application converge to claiming the same subject matter. The applicant may wish to proactively file a terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) to overcome possible future Double Patenting rejections.
35 USC § 101 Analysis
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. 

Claims 1 – 8 and 10 – 20 are directed to statutory subject matter. The claims are directed to non-abstract improvements in computer related technology. A claim is non-statutory when it is directed to a judicial exception (e.g. either one of mathematical concepts, mental processes, or certain methods of organizing human activity) without significantly more. The claimed invention is not directed to a judicial exception. Instead, the claimed invention is directed to a technological improvement for distributed machine learning to train a fog node in a fog layer, where a computerized system receives a machine learning model for the fog node from a cloud, wherein the cloud provides the machine learning model with its initial training, and then monitors resources being used at the fog node, wherein a threshold amount of resources is identified as being an amount of resources needed for normal operations of the fog node, and an amount of spare resources is identified, wherein the amount of spare resources corresponds to a difference between a current amount of resources being used that is less than the amount of resources needed for normal operations of the fog node, and further allocates the identified amount of spare resources for training the machine learning model of the fog node, and trains the machine learning model of the fog node using the identified amount of spare resources. The claimed invention provides a specific improvement to training fog nodes by enabling the training to be done directly on the fog / edge node when resources exist to do so, providing efficiencies to the training process.
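For illustration only, the spare-resource limitation summarized above can be read as simple arithmetic. The following is a minimal sketch under stated assumptions and is not drawn from the specification or from any cited reference: it assumes a single scalar resource measure and reads the "amount of spare resources" as the amount needed for normal operations minus the current usage, whenever current usage is below that amount. The names ResourceSample, spare_resources, and training_budget are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One monitoring reading for a hypothetical fog node (arbitrary units)."""
    in_use: float            # current amount of resources being used
    normal_threshold: float  # amount of resources needed for normal operations


def spare_resources(sample: ResourceSample) -> float:
    """Assumed reading: spare = threshold - current usage, only when usage is below the threshold."""
    if sample.in_use < sample.normal_threshold:
        return sample.normal_threshold - sample.in_use
    return 0.0


def training_budget(sample: ResourceSample) -> float:
    """Allocate the identified spare amount, and nothing more, to model training."""
    return spare_resources(sample)
```

Under this reading, a node using 30 units against a threshold of 50 would have 20 units available for training, and a node at or above the threshold would have none. Other readings of the claim language are possible.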

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1 – 6, and 11 - 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chew (U.S. 2018/0285767 A1; herein referred to as Chew) in view of Ghibril et al. (U.S. 2019/0164087 A1; herein referred to as Ghibril) in further view of Satou (U.S. 2020/0101603 A1; herein referred to as Satou).  
In regard to claim 1, Chew teaches a computer-implemented method for training (see ¶ [0021] “ . . . The techniques described herein provide a method for an analytics engine hosted by a cloud based server to assist in training an analytics engine hosted by an edge server . . .”) a fog node in a fog layer (see ¶ [0049] “ . . .  FIG. 3 is a drawing of a computing network 300, in which the cloud 202 is in communication with a mesh network of IoT devices, which may be termed a fog device 302, operating at the edge of the cloud 202 in accordance with some embodiments. Like numbered items are as described with respect to FIG. 2. As used herein, a fog device 302 is a cluster of devices that may be grouped to perform a specific function, such as providing content to active signs, traffic control, weather control, plant control, home monitoring, and the like . . .”), the method comprising (see ¶ [0044] “ . . . The techniques provide a method to enable an analytics engine hosted by cloud server to train a less powerful analytics engine hosted by an edge server . . .”):
receiving a machine learning model for the fog node from a cloud (see ¶¶ [0041-0042] “ . . . Accordingly, the analytics engine hosted by the cloud server 204 may be used to proactively train an analytics engine hosted by the edge server to increase the prediction accuracies. This allows the implementation of a more economical client or edge hosted machine learning analytics engine that has increasing prediction accuracies and near real-time actionable items feedback. For the traffic control group 206, the increasing prediction accuracies may improve traffic flow through the intersection over time . . .  IoT devices in a group, such as at a mall or airport, may be active signs that determine when a user is viewing the sign. An independent software vendor (ISV) may implement an anonymous video analysis (AVA) algorithm on a gateway, one or more of the active signs, or on another edge server, and utilize the services of a cloud analytics solution provider, for example, hosted by one of the cloud servers 204, to progressively train the AVA analytics algorithm. . . .”), wherein the cloud provides the machine learning model with its initial training (see Fig. 4, ¶ [0082] “ . . . FIG. 4 is a schematic diagram 400 of edge server devices that include analytics engines that may be trained by an analytics engine 402 in a cloud server 404 in accordance with some embodiments. The data used for training the analytics algorithm by machine learning may be properly dispositioned, e.g., classified, prior to training. At the highest level, for example, for the analytics engine 402 hosted by the cloud server 404, the classification may be manually performed, such as by a machine learning expert. For example, for an AVA analysis, the machine learning expert may identify features of images, such as age, gender, emotion, formality of dress, and the like. The cloud hosted analytics engine may be well-trained and optimized for relatively high prediction accuracies . . .”) ;
Chew fails to explicitly teach the machine learning model being used to optimize performance of at least the fog node at the fog layer; monitoring resources being used at the fog node, wherein a threshold amount of resources is identified as being an amount of resources needed for normal operations of the fog node; identifying an amount of spare resources that are available, wherein the amount of spare resources corresponds to a difference between a current amount of resources being used that is less than the amount of resources needed for normal operations of the fog node; allocating the identified amount of spare resources for training the machine learning model of the fog node; and training the machine learning model of the fog node using the identified amount of spare resources.
However, Ghibril teaches monitoring resources being used at the fog node (see ¶ [0031] “ . . . An edge device is a type of computing device 130 having resources to host at least one service component of an information service and also function as an access point to a network for providing the information service. The resources in the edge device can be either hardware or software that are configurable based on a command received from an external source (e.g., an orchestrator 140). Example edge devices may include micro data centers, edge routers, provider edge routers, aggregation routers, customer premise equipment (CPE), set-top boxes, cloudlets, fog nodes . . .”), wherein a threshold amount of resources is identified as being an amount of resources needed for normal operations of the fog node (see Fig. 4, ¶ [0038] “ . . . FIG. 4 is a diagram of various types of intelligent entities 120 according to one embodiment. Example types of intelligent entities 120 include a Fault IE, Capacity IE, Performance IE, Security IE, Inventory IE, and Alarm IE. The intelligent entities 120 may communicate with various management components (e.g., fault management, capacity management, etc.) of the OSS management and orchestration 410 to perform different types of tasks. Particularly, the Fault IE may identify and determine solutions to issues for fault management. The Capacity IE may track workload or resources of computing devices 130 and predict when additional capacity should be allocated to support an increase in demand. The Performance IE may monitor performance metrics of computing devices 130 such as latency, memory usage, CPU usage, network bandwidth, etc. The Security IE protects the infrastructure 400 from unauthorized activity and may detect anomalies in the system. 
The Alarm IE generates and transmits alarms responsive to determining that a given event has occurred (e.g., commissioning or decommissioning of a computing device 130) or that a certain condition has been satisfied (e.g., resource usage has reached at least a threshold level of capacity). . .”) ; identifying an amount of spare resources that are available ( see ¶ [0043] “ . . .  responsive to determining a negative impact or reduction in available resources, the Predictive Auto-scaling IE determines to commission additional resources or turn off existing virtual machines or computing devices 130 to release lower priority or unused resources. In some embodiments, the Predictive Auto-scaling IE may check with another IE before performing an action. For example, the Predictive Auto-scaling IE checks with a Social Media IE to determine whether resource capacity should be maintained for an upcoming social event . . .”) , wherein the amount of spare resources corresponds to a difference between a current amount of resources being used that is less than the amount of resources needed for normal operations of the fog node (e.g. computing device) (see ¶ [0051] “ . . . The resource tracker 712 and provisioning module 714 operate on the control plane of the network 180. The resource tracker 712 monitors the resources 718 of the computing device 130. The resource tracker 712 may track, for example, current and historical demand for the resources 718, assignments of service components to the resources 718, performance requirements of service components, or characteristics of the resources 718. 
Types of the characteristics may include compute characteristics (e.g., CPU type, number of CPUs, CPU speed or latency, etc.), storage characteristics (e.g., volatile or non-volatile memory, storage volume in gigabytes or terabytes, read and write latency, etc.), networking characteristics (e.g., number of interfaces and network speed), node geographical location (e.g., jurisdiction, country, or longitude and latitude coordinates of the computing devices), node connectivity (e.g., nearby computing devices, connection speed, etc.), and access connectivity (e.g., fiber connection, radiofrequency access, spectrum, bandwidth, cell identification, etc.), among other characteristics. The resource tracker 712 may provide the tracked resource information to an orchestrator 140 or the provisioning module 714. . . .”);
allocating the identified amount of spare resources (e.g. available resources) for training the machine learning model of the fog node (see ¶ [0053] “ . . . FIG. 8 is a flow chart illustrating a process 800 for performing predictions using intelligent entities, according to one embodiment. In an example use case, an intelligent element framework 110 chains a Traffic IE, User Activity IE, Resource Prediction IE, and Recommendation IE to perform energy management. The Traffic IE determines trends 810 in resource demand. The Traffic IE may use a model trained using one or more features to determine trends. For instance, the features indicate metrics associated with available resources or previous resource demand, e.g., a growth or decrease in demand using historical traffic data. Accordingly, the Traffic IE may learn to predict that similar or different trends may occur in the future given certain conditions. In some embodiments, the features are generated by at least one other intelligent entity 120. . . . “); and
training the machine learning model of the fog node using the identified amount of spare resources (see ¶ [0054] “ . . . The User Activity IE predicts user activity 820, e.g., using social media information or historical user movement. The User Activity IE may also use a model trained using one or more features to generate predictions of user activity. The model used by the User Activity IE may be different than a model used by the Traffic IE. Generally, intelligent entities 120 may use different models from each other, models trained with different training data, or models trained using different machine learning algorithms. The User Activity IE may predict user activity such as usage levels of computing devices 130, periods of time with relatively greater or less traffic on a network, locations to which users are likely to travel, or aggregate activity from a population of user . . . “).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the applicant’s application to incorporate a method, system and product for automating computer devices through the cloud using machine learning training, that can be updated, as taught by Ghibril, into a method, system and product for training local devices using pre-classified training data from cloud servers, as taught by Chew.  Such incorporation enables devices configured as fog devices to be trained to react to changes in environment or resource allocation.
The combination of Chew and Ghibril fails to explicitly teach the machine learning model being used to optimize performance of at least the fog node at the fog layer. However, Satou teaches the machine learning model (see Fig. 6, ¶¶ [0060-0062] “ . . . As illustrated in FIG. 6, the following second to fourth embodiments assume a system in which a plurality of devices are logically divided into three layers: a layer containing a cloud server 6 and the like, a layer containing a fog computer 7 and the like, and a layer containing an edge computer 8 (such as a robot controller and the controller included in a cell 9) in a state where each of the plurality of devices is connected to a network.  In such a system, the controller 1 according to an embodiment of the present invention can be implemented on any of the cloud server 6, the fog computer 7, and the edge computer 8, so that data for use in machine learning can be shared among the plurality of devices via the network for distributed learning, the generated learning model can be collected in the fog computer 7 and the cloud server 6 for large-scale analysis, and further the generated learning model can be mutually reused.  In the system illustrated in FIG. 6, a plurality of cells 9 are provided in a factory in various places and a fog computer 7 located in the upper layer manages each cell 9 in a predetermined unit (such as in units of factories and in units of a plurality of factories of the same manufacturer). The data collected and analyzed by these fog computers 7 are further collected and analyzed by the cloud server 6 in the upper layer, and the information obtained as the result can be used for control and the like by each edge computer 8 . . .”) being used to optimize performance (see Fig. 9, ¶¶ [0074-0075] “ . . . A control system 500′ of the present embodiment comprises at least one machine learning device 100 (illustrated as an example implemented as a part of the fog computer 7 in FIG. 9) implemented as a part of a computer such as a cloud server, a host computer, and a fog computer, a plurality of controllers 1″, and the network 5 connecting these controllers 1″ and the computer to each other. Note that the hardware configuration of the computer is the same as the schematic hardware configuration of the controller 1′ illustrated in FIG. 7 such that the hardware components such as the CPU 311, the RAM 313, and the non-volatile memory 314 provided in a general computer are connected through the bus 320.  In the control system 500′ having the aforementioned configuration, based on the state variable S and the determination data D obtained from each of the plurality of controllers 1″, the machine learning device 100 learns the adjustment of the control command of the manipulator in the industrial robot 2 common to all the controllers 1″, and then by using the learning result, can perform the adjustment of the control command of the manipulator in each industrial robots 2. According to the configuration of the control system 500′, when needed, the necessary number of controllers 1″ can be connected to the machine learning device 100 regardless of where and when each of the plurality of controllers 1″ exists. . . “) of at least the fog node at the fog layer (see Fig. 10, ¶ [0082] “ . . . An example of operation of the control system 500″ according to the present embodiment may be such that the machine learning device 100′ is arranged on the fog computer 7 installed for a plurality of controllers 1 as the edge computer, the learning model generated by each controller 1 is collected by and stored in the fog computer 7, optimization or streamlining is performed based on a plurality of stored learning models, and then the optimized or streamlined learning model is redistributed to each controller 1 as needed. . . .”); 
It would have been obvious to one with ordinary skill in the art before the effective filing date of the applicant’s application to incorporate a method, system and product to enable a control system to manipulate a robot device, the control system comprises a layer of fog computers in communication with edge devices used for the control of the robot, the fog computer includes a machine learning device that learns optimal adjustments to operate the controller of the robot, and the operational performance of the entire control system including the fog computer is optimized using the knowledge of the  learning models that was generated, as taught by Satou, into a method, system and product for training local devices using pre-classified training data from cloud servers,  and then automatically updating the training as necessary, as taught by the combination of Chew and Ghibril.  Such incorporation enables an optimal machine model on a fog device to be a baseline for operation.
In regard to claim 2,  the combination of Chew, Ghibril, and Satou teaches wherein the monitoring step is performed by a system performance monitor that is installed on the fog node (see Ghibril Fig. 1, Fig. 4, ¶ [0038] “ . . . The Performance IE may monitor performance metrics of computing devices 130 such as latency, memory usage, CPU usage, network bandwidth, etc. The Security IE protects the infrastructure 400 from unauthorized activity and may detect anomalies in the system. The Inventory IE manages inventory of the computing devices 130 or other components in the system. The Alarm IE generates and transmits alarms responsive to determining that a given event has occurred (e.g., commissioning or decommissioning of a computing device 130) or that a certain condition has been satisfied (e.g., resource usage has reached at least a threshold level of capacity). . . “).
The motivation to combine the references is described for the rejection of claim 1 and is incorporated herein.  Additionally, Ghibril offers performance monitoring on each device.  
In regard to claim 3, the combination of Chew, Ghibril, and Satou teaches wherein the training of the machine learning model is performed using a sampled data set (see Chew ¶ [0016] “ . . . The analytics engine may be pre-trained through supervised machine learning using a set of predefined training data, such as pre-classified images, data patterns, and the like. . . “).
In regard to claim 4, the combination of Chew, Ghibril, and Satou teaches wherein a sampling (e.g. types of input) is based on the amount of spare resources available to the fog node (see Ghibril Fig. 8, ¶ [0055] “ . . . The Traffic IE and User Activity IE may provide their outputs, resource trends and user activity predictions, respectively, to the Resource Prediction IE. In particular, the outputs may be provided via the intelligent element framework 110 that chains the Traffic IE and User Activity IE to the Resource Prediction IE, e.g., using one or more channels. The Resource Prediction IE predicts resource demand 830 for a cell (e.g., a computing device 130) using the input from the other IEs. The Resource Prediction IE may also receive other types of input such as weather information, news, or events for generating predictions regarding the cell. In addition, the Resource Prediction IE determines current resource utilization 840. The Resource Prediction IE provides its outputs to the Recommendation IE . . .”).
The motivation to combine the references is described for the rejection of claim 1 and is incorporated herein.  Additionally, Ghibril uses data from input data sets particular to resources available to the device.
In regard to claim 5, the combination of Chew, Ghibril, and Satou teaches wherein the training of the machine learning model is performed so long as spare resources are available (e.g. cyclic based on chained feedback) at the fog node (see Ghibril Fig. 8, ¶ [0057] “ . . . The process 800 as described above is performed using intelligent entities 120 chained in series as part of a directed graph. In other embodiments, the intelligent entities 120 may be part of a cyclic graph where at least one of the IEs uses feedback from another IE. For instance, the Resource Prediction IE predicts resource demand using previous recommendations determined by the Recommendation IE. The intelligent element framework 110 or an IE may determine a quality of output generated by another IE for the feedback. . . “).
The motivation to combine the references is described for the rejection of claim 1 and is incorporated herein.  Additionally, Ghibril provides a mechanism for continuing a process to train a device so long as there are available resources.
In regard to claim 6, the combination of Chew, Ghibril, and Satou teaches further comprising automatically terminating (e.g. turn off) a current training session of the machine learning model when the spare resources are no longer available at the fog node (see Ghibril Fig. 8, ¶ [0056] “ . . . The Recommendation IE determines a recommendation 850 to turn on or off management components of the cell. For example, responsive to a prediction that resource demand and user activity is predicted to decrease during a given time period (e.g., the weekend), the Recommendation IE recommends to turn off at least a portion of the cell to preserve energy. In some embodiments, the Recommendation IE may determine other types of recommendations, for example, requesting an intervention to mitigate an identified fault or alert, commissioning new management components or reconfiguring existing management components, or triggering other artificial intelligence tasks . . .”).
The motivation to combine the references is described for the rejection of claim 1 and is incorporated herein.  Additionally, Ghibril provides the ability to turn off training in certain circumstances. 
In regard to claim 11, Chew teaches a non-transitory computer-readable medium comprising instructions (see ¶ [0119] “ . . . FIG. 7 is a block diagram of an exemplary non-transitory, machine readable medium 700 including code to direct a processor 702 to access a cloud server for training in accordance with some embodiments. The processor 702 may access the non-transitory, machine readable medium 700 over a bus 704. The processor 702 and bus 704 may be selected as described with respect to the processor 602 and bus 606 of FIG. 6. The non-transitory, machine readable medium 700 may include devices described for the mass storage 608 of FIG. 6, or may include optical disks, thumb drives, or any number of other hardware devices . . .”) for training a fog node in a fog layer (see ¶ [0049] “ . . . FIG. 3 is a drawing of a computing network 300, in which the cloud 202 is in communication with a mesh network of IoT devices, which may be termed a fog device 302, operating at the edge of the cloud 202 in accordance with some embodiments. Like numbered items are as described with respect to FIG. 2. As used herein, a fog device 302 is a cluster of devices that may be grouped to perform a specific function, such as providing content to active signs, traffic control, weather control, plant control, home monitoring, and the like . . .”), the instructions, when executed by a computing system (see ¶ [0120] “ . . . the non-transitory, machine readable medium 700 may include code 706 to direct the processor 702 to analyze data from a sensor to determine a classification or a prediction. Code 708 may be included to direct the processor 702 to calculate a confidence level for the prediction. Code 710 may be included to direct the processor 702 to send the data to a cloud server for analysis by an analytics engine hosted by the cloud server. . . ”), cause the computing system to (see ¶ [0044] “ . . . The techniques provide a method to enable an analytics engine hosted by cloud server to train a less powerful analytics engine hosted by an edge server . . .”):
receive a machine learning model for the fog node from a cloud (see ¶¶ [0041-0042] as described for the rejection of claim 1 and is incorporated herein), wherein the cloud provides the machine learning model with its initial training (see Fig. 4, ¶ [0082] as described for the rejection of claim 1 and is incorporated herein).
Chew fails to explicitly teach the machine learning model being used to optimize performance of at least the fog node at the fog layer; monitor resources being used at the fog node, wherein a threshold amount of resources is identified as being an amount of resources needed for normal operations of the fog node; identify an amount of spare resources that are available, wherein the amount of spare resources corresponds to a difference between a current amount of resources being used that is less than the amount of resources needed for normal operations of the fog node; allocate the identified amount of spare resources for training the machine learning model of the fog node; and train the machine learning model of the fog node using the spare resources.
However, Ghibril teaches monitor resources being used at the fog node (see ¶ [0031] as described for the rejection of claim 1 and is incorporated herein), wherein a threshold amount of resources is identified as being an amount of resources needed for normal operations of the fog node (see Fig. 4, ¶ [0038] as described for the rejection of claim 1 and is incorporated herein);
identify an amount of spare resources that are available( see ¶ [0043] as described for the rejection of claim 1 and is incorporated herein), wherein the amount of spare resources corresponds to a difference between a current amount of resources being used that is less than the amount of resources needed for normal operations of the fog node (e.g. computing device) (see ¶ [0051] as described for the rejection of claim 1 and is incorporated herein);
allocate the identified amount of spare resources (e.g. available resources)  for training the machine learning model of the fog node (see ¶ [0053] as described for the rejection of claim 1 and is incorporated herein); and
train the machine learning model of the fog node using the spare resources (see ¶ [0054] as described for the rejection of claim 1 and is incorporated herein).
The motivation to combine Ghibril with Chew is described for the rejection of claim 1 and is incorporated herein.
The combination of Chew and Ghibril fails to explicitly teach the machine learning model being used to optimize performance of at least the fog node at the fog layer. However, Satou teaches the machine learning model (see Fig. 6, ¶¶ [0060-0062] as described for the rejection of claim 1 and is incorporated herein) being used to optimize performance (see Fig. 9, ¶¶ [0074-0075] as described for the rejection of claim 1 and is incorporated herein) of at least the fog node at the fog layer (see Fig. 10, ¶ [0082] as described for the rejection of claim 1 and is incorporated herein).
The motivation to combine Satou with the combination of Chew and Ghibril is described for the rejection of claim 1 and is incorporated herein.
In regard to claim 12, the combination of Chew, Ghibril, and Satou teaches wherein the instructions further cause the computing system to control a sampling of data being used to train the machine learning model (see Chew ¶ [0016] as described for the rejection of claim 3 and is incorporated herein).
In regard to claim 13, the combination of Chew, Ghibril, and Satou teaches wherein the instructions to control the sampling of data being used to train the machine learning model is based on the amount of spare resources available at the fog node (see Ghibril Fig. 8, ¶ [0055] as described for the rejection of claim 4 and is incorporated herein).
The motivation to combine the references is described for claim 4 and is incorporated herein.
In regard to claim 14, the combination of Chew, Ghibril, and Satou teaches wherein the instructions further monitor the fog node to identify when spare resources are no longer available  (see Ghibril Fig. 8, ¶ [0057] as described for the rejection of claim 5 and is incorporated herein).
The motivation to combine the references is described for claim 5 and is incorporated herein.
In regard to claim 15, the combination of Chew, Ghibril, and Satou teaches wherein the instructions further terminate a current training session of the machine learning model based on the identification that the spare resources are no longer available at the fog node (see Ghibril Fig. 8, ¶ [0056] as described for the rejection of claim 6 and is incorporated herein).
The motivation to combine the references is described for claim 6 and is incorporated herein.
In regard to claim 16, Chew teaches a system (see Fig. 2, ¶ [0034] “ . . . FIG. 2 is a drawing of a computer network 200 in which a cloud 202 is in communication with a number of Internet of Things (IoT) devices in accordance with some embodiments . . .”) for training a fog node in a fog layer (see ¶ [0049] “ . . . FIG. 3 is a drawing of a computing network 300, in which the cloud 202 is in communication with a mesh network of IoT devices, which may be termed a fog device 302, operating at the edge of the cloud 202 in accordance with some embodiments. Like numbered items are as described with respect to FIG. 2. As used herein, a fog device 302 is a cluster of devices that may be grouped to perform a specific function, such as providing content to active signs, traffic control, weather control, plant control, home monitoring, and the like . . .”), the system comprising (see ¶ [0021] “ . . . The techniques described herein provide a method for an analytics engine hosted by a cloud based server to assist in training an analytics engine hosted by an edge server . . .”):
a processor (see ¶ [0100] “ . . The edge server 600 may include a processor 602, which may be a microprocessor, a multi-core processor, a multithreaded processor, an ultra-low voltage processor, an embedded processor, or other known processing element. The processor 602 may be a part of a system on a chip (SoC) in which the processor 602 and other components are formed into a single integrated circuit, or a single package, such as the Edison™ or Galileo™ SoC boards from Intel . . .”); and 
a non-transitory computer-readable medium storing instructions that ( see  ¶ [0119] “ . . . FIG. 7 is a block diagram of an exemplary non-transitory, machine readable medium 700 including code to direct a processor 702 to access a cloud server for training in accordance with some embodiments. The processor 702 may access the non-transitory, machine readable medium 700 over a bus 704. The processor 702 and bus 704 may be selected as described with respect to the processor 602 and bus 606 of FIG. 6. The non-transitory, machine readable medium 700 may include devices described for the mass storage 608 of FIG. 6, or may include optical disks, thumb drives, or any number of other hardware devices . . .”), when executed by the system( see  ¶ [0120] “ . . .  the non-transitory, machine readable medium 700 may include code 706 to direct the processor 702 to analyze data from a sensor to determine a classification or a prediction. Code 708 may be included to direct the processor 702 to calculate a confidence level for the prediction. Code 710 may be included to direct the processor 702 to send the data to a cloud server for analysis by an analytics engine hosted by the cloud server. . . “), cause the system to (see ¶ [0044] “ . . . The techniques provide a method to enable an analytics engine hosted by cloud server to train a less powerful analytics engine hosted by an edge server . . .”): 
receive, from a cloud, an initially trained machine learning model (see Fig. 4, ¶ [0082] as described for the rejection of claim 1 and is incorporated herein) for a fog node within a fog layer (see ¶¶ [0041-0042] as described for the rejection of claim 1 and is incorporated herein).
Chew fails to explicitly teach the machine learning model being used to optimize performance of at least the fog node at the fog layer; monitor resources being used at the fog node, wherein a threshold amount of resources is identified as being an amount of resources needed for normal operations of the fog node; identify an amount of spare resources that are available, wherein the amount of spare resources corresponds to a difference between a current amount of resources being used that is less than the amount of resources needed for normal operations of the fog node; allocate the identified amount of spare resources for training the machine learning model of the fog node; and train the machine learning model of the fog node using the identified amount of spare resources.
However, Ghibril teaches monitor resources being used at the fog node (see ¶ [0031] as described for the rejection of claim 1 and is incorporated herein), wherein a threshold amount of resources is identified as being an amount of resources needed for normal operations of the fog node (see Fig. 4, ¶ [0038] as described for the rejection of claim 1 and is incorporated herein);
identify an amount of spare resources that are available( see ¶ [0043] as described for the rejection of claim 1 and is incorporated herein), wherein the amount of spare resources corresponds to a difference between a current amount of resources being used that is less than the amount of resources needed for normal operations of the fog node (e.g. computing device) (see ¶ [0051] as described for the rejection of claim 1 and is incorporated herein);
allocate the identified amount of spare resources (e.g. available resources)  for training the machine learning model of the fog node (see ¶ [0053] as described for the rejection of claim 1 and is incorporated herein); and
train the machine learning model of the fog node using the spare resources (see ¶ [0054] as described for the rejection of claim 1 and is incorporated herein).
The motivation to combine Ghibril with Chew is described for the rejection of claim 1 and is incorporated herein.
The combination of Chew and Ghibril fails to explicitly teach the machine learning model being used to optimize performance of at least the fog node at the fog layer. However, Satou teaches the machine learning model (see Fig. 6, ¶¶ [0060-0062] as described for the rejection of claim 1 and is incorporated herein) being used to optimize performance (see Fig. 9, ¶¶ [0074-0075] as described for the rejection of claim 1 and is incorporated herein) of at least the fog node at the fog layer (see Fig. 10, ¶ [0082] as described for the rejection of claim 1 and is incorporated herein).
The motivation to combine Satou with the combination of Chew and Ghibril is described for the rejection of claim 1 and is incorporated herein.
In regard to claim 17, the combination of Chew, Ghibril, and Satou  teaches wherein the instructions further cause the system to control a sampling of data being used to train the machine learning model(see Chew ¶ [0016] as described for the rejection of claim 3 and is incorporated herein).
In regard to claim 18, the combination of Chew, Ghibril, and Satou  teaches wherein the controlling is based on the amount of spare resources available at the fog node (see Ghibril Fig. 8, ¶ [0055] as described for the rejection of claim 4 and is incorporated herein).
The motivation to combine the references is described for claim 4 and is incorporated herein.
In regard to claim 19, the combination of Chew, Ghibril, and Satou teaches wherein the instructions further cause the system to monitor the fog node to identify when spare resources are no longer available (see Ghibril Fig. 8, ¶ [0057] as described for the rejection of claim 5 and is incorporated herein).
The motivation to combine the references is described for claim 5 and is incorporated herein.
In regard to claim 20, the combination of Chew, Ghibril, and Satou teaches wherein the instructions further cause the system to terminate a current training session of the machine learning model based on the identification that the spare resources are no longer available at the fog node (see Ghibril Fig. 8, ¶ [0056] as described for the rejection of claim 6 and is incorporated herein).
The motivation to combine the references is described for claim 6 and is incorporated herein.
Claims 7 – 8 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Chew (U.S. 2018/0285767 A1; herein referred to as Chew) in view of Ghibril et al. (U.S. 2019/0164087 A1; herein referred to as Ghibril), in further view of Satou (U.S. 2020/0101603 A1; herein referred to as Satou) as applied to claims 1 – 6 and 11 – 20, and in further view of Prakash et al. (U.S. 2019/0138934 A1; herein referred to as Prakash).
In regard to claim 7, the combination of Chew, Ghibril, and Satou fails to explicitly teach further comprising automatically terminating future training sessions and completing current training sessions of the machine learning model when an amount of spare resources available at the fog node falls below a pre-determined threshold amount. However, Prakash teaches further comprising automatically terminating future training sessions (see ¶ [0044] “ . . . When all selected edge nodes 101, 201 that were connected to a specific instance of the training process (model) β have disconnected, the instance of the training process (model) β may be terminated. . . ”) and completing current training sessions of the machine learning model (see ¶ [0043] “ . . . For edge-cloud ML or distributed learning, ML training is performed on a dataset to learn parameters of an underlying model β, where the dataset and computational tasks of the ML training process are distributed across a plurality of edge nodes 101, 201. . . By off-loading ML training tasks to individual edge nodes 101, 201, the ML training process may be accelerated and/or may provide a more efficient use of computational resources . . .”) when an amount of spare resources available at the fog node (see ¶ [0048] “ . . . The operational parameters of the edge compute nodes 101, 201 includes compute node capabilities and operational constraints or contexts.
The compute node capabilities may include, for example, configuration information (e.g., a hardware platform make and model, hardware component types and arrangement within the hardware platform, associated peripheral and/or attached devices/systems, processor architecture, currently running operating systems and/or applications and/or their requirements, subscription data (e.g., data plan and permissions for network access), security levels or permissions (e.g., possible authentication and/or authorization required to access the edge compute node 101, 201), etc.); computational capacity (e.g., a total processor speed of one or more processors, a total number of VMs capable of being operated by the edge compute node 101, 201, a memory or storage size, an average computation time per workload, a reuse degree of computational resources, etc.); current or predicted computational load and/or computational resources (e.g., processor utilization or occupied processor resources, memory or storage utilization, etc.); current or predicted unoccupied computational resources (e.g., available or unused memory and/or processor resources, available VMs, etc.); network capabilities (e.g., link adaptation capabilities, configured and/or maximum transmit power, achievable data rate per channel usage, antenna configurations, supported radio technologies or functionalities of a device (e.g., whether a UE 101 supports Bluetooth/BLE; whether an (R)AN node 111 supports LTE-WLAN aggregation (LWA) and/or LTE/WLAN Radio Level Integration with IPsec Tunnel (LWIP), etc.), subscription information of particular UEs 101, etc.); energy budget (e.g., battery power budget); and/or other like capabilities . . .”) falls below a pre-determined threshold amount (see ¶ [0049] “ . . . 
the operational contexts and/or constraints may be based on a pre-assessment of an operational state of the edge compute nodes 101, 102, which may be based on previously indicated operational contexts and/or constraints for different offloading opportunities. This may involve, for example, evaluating both computation and communication resources needed for different offloading opportunities. The threshold criteria or a desired level of reliability mentioned previously may be based on a certain amount or type of compute node capabilities (e.g., a certain processor speed) and/or a type of operational constraints under which the compute node is operating (e.g., a desired link quality, a desired surrounding temperature, a desired processor temperature, etc.) . . .”).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the applicant’s application to incorporate a method, system and product to manage distributed machine learning (ML) training using heterogeneous compute nodes in a heterogeneous computing environment, where the heterogeneous compute nodes are connected to a master node via respective wireless links, ML computations are performed by individual heterogeneous compute nodes on respective training datasets, and the master node combines the outputs of the ML computations obtained from the individual heterogeneous compute nodes, as taught by Prakash, into a method, system and product for training local devices using pre-classified training data from cloud servers, and then automatically updating the training as necessary, using fog devices to optimize the training models, as taught by the combination of Chew, Ghibril, and Satou. Such incorporation provides criteria for administering the training based on constraints and thresholds of the devices and the resources available.
In regard to claim 8, the combination of Chew, Ghibril, Satou, and Prakash teaches  wherein the training of the machine learning model is performed so long as a pre-determined threshold amount of spare resources are available at the fog node (see Prakash ¶ [0045] “ . . . define criteria to be used by the MEC system 200 for determining threshold criteria or a desired level of reliability for selecting a particular edge compute node 101, 201 to perform computational tasks β. In this example, the threshold criteria may be based on a desired epoch time for computing a full gradient from obtained partial gradients from each edge compute node 101, 201. In another example, the load balancing policy may define criteria (e.g., load allocation criteria) to be used by the MEC system 200 for determining how to partition the training data into different datasets x.sub.1-x.sub.m. . . “).
The motivation to combine Prakash with the combination of Chew, Ghibril, and Satou is described for the rejection of claim 7.  Additionally, Prakash provides different criteria of threshold values necessary to train an edge fog node.  
In regard to claim 10, the combination of Chew, Ghibril, Satou and Prakash teaches wherein the machine learning model (see Prakash¶ [0004] “ . . .  Many forms of machine learning (ML), such as supervised learning, perform a training process on a relatively large dataset to estimate an underlying ML model. Linear regression is one type of supervised ML algorithm that is used for classification, stock market analysis, weather prediction, and the like. Gradient descent (GD) algorithms are often used in linear regression. Given a function defined by a set of parameters, a GD algorithm starts with an initial set of parameter values, and iteratively moves toward a set of parameter values that minimize the function. This iterative minimization is achieved by taking steps in the negative direction of the function gradient. Example use cases for GD algorithms include localization in wireless sensor networks and distributed path-planning for drones. . .”) received from the cloud is an initially trained machine learning model from the cloud (see Prakash¶ [0021] “ . . . The present disclosure is related to distributed machine learning (ML) in distributed heterogeneous computing environments, where computational resources of multiple edge compute nodes are utilized for collaborative learning for an underlying ML model. Distributed heterogeneous computing environments are computing environments where compute (processing) and storage resources are available at multiple edge compute nodes, with varying capabilities and operational constraints. Generally, an ML algorithm is a computer program that learns from experience with respect to some task and some performance measure, and an ML model may be any object or data structure created after an ML algorithm is trained with one or more training datasets. After training, an ML model may be used to make predictions on new datasets. 
Although the term “ML algorithm” refers to different concepts than the term “ML model,” these terms as discussed herein may be used interchangeably for the purposes of the present disclosure. . .”) that uses default values or past machine learning models of similar fog nodes stored within the cloud  (see Prakash ¶ [0099] “ . . . Gradient descent (GD) is an optimization algorithm used to minimize a target function by iteratively moving in the direction of a steepest descent as defined by a negative of the gradient. An objective of GD in machine learning (ML) is to utilize a training dataset D in order to accurately estimate the unknown model β over one or more epochs r. In ML, GD is used to update the parameters of the unknown model β. Parameters refer to coefficients in linear regression and weights in a neural network. These objectives are realized in an iterative fashion by computing β.sup.(r) at the r-th epoch, and evaluating a gradient associated with the squared-error cost function defined by f (β.sup.(r))=∥Xβ.sup.(r)−Y∥.sup.2. The cost function indicates how accurate the model β is at making predictions for a given set of parameters. The cost function has a corresponding curve and corresponding gradients, where the slope of the cost function curve indicates how the parameters should be changed to make the model β more accurate. In other words, the model β is used to make predictions, and the cost function is used to update the parameters for the model β.  . . .”).
The motivation to combine Prakash with the combination of Chew, Ghibril, and Satou is described for the rejection of claim 7. Additionally, Prakash provides access to several machine learning models that can be used to train the fog devices.
Conclusion
There is prior art made of record which is not relied upon but is considered pertinent to applicant’s disclosure. It is listed on the PTO-892 accompanying this action.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES N FIORILLO whose telephone number is (571)272-9909. The examiner can normally be reached on 7:30 - 5 PM Mon - Fri.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John A. Follansbee can be reached on 571-272-3964.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JAMES N FIORILLO/Examiner, Art Unit 2444