DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
	
Status of the Application
2.	Claims 1-20 are pending in this application (16/588,913), as Applicant has filed a Request for Continued Examination (RCE) under 37 CFR 1.114 on 09/06/2022, following the Final Rejection office action dated 06/01/2022.    
	Claims 1-20 have been amended. 
	(Please see Claims on pages 2-7 of Applicant Arguments/Remarks, filed on 08/12/2022)
	Applicant's submissions have been entered.


Information Disclosure Statement
3.	Applicant’s Information Disclosure Statement (IDS), filed on 09/06/2022, has been received and entered into the record. The references cited therein have been considered by the Examiner. See attached PTO-1449 form(s).

			Withdrawal of 35 U.S.C. § 112(f) Interpretation 
4.	The previously noted 35 U.S.C. § 112(f) interpretation of Claim 1 is hereby withdrawn based on Applicant’s amendments and the arguments made in this regard on page 8 of Applicant Arguments/Remarks filed on 08/12/2022.

Claim Rejections - 35 USC § 103
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

6.	Claims 1-20 are rejected under AIA 35 U.S.C. 103 as being unpatentable over FAULHABER, JR. et al. (US 2019/0156247 A1, Pub. Date: May 23, 2019; Filed: Mar. 13, 2018; hereinafter FAULHABER) [cited by Applicant as prior art in the IDS filed on 04/29/2021], in view of Sobol et al. (US 2019/0209022 A1; Pub. Date: Jul. 11, 2019; Filed: Dec. 27, 2018; hereinafter Sobol).

Regarding claim 1, FAULHABER teaches: 
(Currently amended) A system (See, e.g., FAULHABER, FIG. 1; par [0025]: “…FIG. 1 is a block diagram illustrating an environment for dynamic accuracy-based experimentation and deployment of machine learning models in provider networks according to some embodiments. …”  Also see, e.g., FAULHABER, FIG. 11: COMPUTER SYSTEM 1100; pars [0138]-[0144]: “…a system that implements a portion or all of the techniques for dynamic accuracy-based deployment and monitoring of ML models as described herein may include a general-purpose computer system that includes or is configured to access one or more computer-accessible media, such as computer system 1100 illustrated in FIG. 11. …” (emphasis added) Examiner Note (EN):  FAULHABER discloses: FIG. 1 is a block diagram illustrating an environment for dynamic accuracy-based experimentation and deployment of machine learning models, using a computer system 1100 illustrated in FIG. 11.), comprising: 

one or more computing devices comprising one or more processors (See, e.g., FAULHABER, FIG. 11; pars [0138]-[0139]:  “…computer system 1100 may include one computing device or any number of computing devices configured to work together as a single computer system 1100
… computer system 1100 may be a uniprocessor system including one processor 1110, or a multiprocessor system including several processors 1110 (e.g., two, four, eight, or another suitable number). Processors 1110 may be any suitable processors capable of executing instructions…” (emphasis added) EN:  FAULHABER discloses: computer system 1100 includes one computing device or any number of computing devices configured to work together as a single computer system 1100 that may be a uniprocessor system including one processor 1110, or a multiprocessor system including several processors 1110.) configured to implement a machine learning training cluster (See, e.g., FAULHABER, FIG. 1; par [0030]:  “…multiple ML models 118A-118N can be configured as part of a group 116 of models--e.g., multiple models trained to perform a same "type" of inference…” EN:  FAULHABER discloses: multiple ML [machine learning] models 118A-118N can be configured as part of a group [cluster] 116 of models.), wherein the machine learning training cluster, using the one or more processors, is configured to: 

train a machine learning model (See, e.g., FAULHABER, FIG. 1; par [0030]:  “…multiple ML models 118A-118N can be configured as part of a group 116 of models--e.g., multiple models trained to perform a same "type" of inference…” EN:  FAULHABER discloses: multiple ML models trained to perform a same "type" of inference.); and

collect data produced from the training of the machine learning model on the one or more computing devices, wherein the data produced from the training of the machine learning model is collected using agent software on the one or more computing devices (See, e.g., FAULHABER, FIG. 2; par [0036]:  “…the analytics engine 122 can interact with a ground truth collector 124 to obtain ground truth for a set of requests, and compare this obtained ground truth with the inference results generated by the ML model(s) 118A-118C under scrutiny to identify the true accuracy of these models.”  EN:  FAULHABER teaches: ground truth collector 124 [agent software] to obtain [collect] ground truth for a set of requests, and compare this obtained ground truth with the inference results generated by the ML model(s) 118A-118C under scrutiny to identify the true accuracy of these models.), and …; and

one or more computing devices comprising one or more processors (See, e.g., FAULHABER, FIG. 11; pars [0138]-[0139]:  “…computer system 1100 may include one computing device or any number of computing devices configured to work together as a single computer system 1100
… computer system 1100 may be a uniprocessor system including one processor 1110, or a multiprocessor system including several processors 1110 (e.g., two, four, eight, or another suitable number). Processors 1110 may be any suitable processors capable of executing instructions. .…” (emphasis added) EN:  FAULHABER discloses: computer system 1100 includes one computing device or any number of computing devices configured to work together as a single computer system 1100 that may be a uniprocessor system including one processor 1110, or a multiprocessor system including several processors 1110.) configured to implement a machine learning analysis system (See, e.g., FAULHABER, FIG. 1; par [0031]:  “The machine learning service 140, which may execute the group 116 of ML models 118A-118N (e.g., within a model hosting system), in some embodiments includes a dynamic router 108 and an analytics engine 122.…” (emphasis added)  EN:  FAULHABER discloses: The machine learning service 140 includes an analytics engine 122.), wherein the machine learning analysis system, using the one or more processors, is configured to:
detect one or more problems from the training of the machine learning model based at least in part on the analysis of the aggregated data (See, e.g., FAULHABER, FIG. 1; par [0034]:  “The data 136 may include, for example, the input data 134 (e.g., provided in, or identified by, a request 132), the individual inference results 142 generated by the ML models 118, etc. The analytics engine 122 can determine, using such data 136, the quality of the inferences of the ML model(s) 118. …” (emphasis added)  Also see, e.g., FAULHABER, par [0068]: “…the dashboard could be interactive and allow a user to change how traffic is passed to models, add models to groups, pull models out of groups, etc., and can also have alarming and alerting (e.g., to indicate that a model is not performing well). …” (emphasis added)  EN:  FAULHABER teaches: determine, using data 136 that includes input data 134 and inference results 142 generated by the ML models 118, the quality of the inferences of the ML model(s) 118 that may indicate that a model is not performing well.); and 
generate one or more alarms describing the one or more problems from the training of the machine learning model (See, e.g., FAULHABER, FIG. 1; par [0034]:  “The data 136 may include, for example, the input data 134 (e.g., provided in, or identified by, a request 132), the individual inference results 142 generated by the ML models 118, etc. The analytics engine 122 can determine, using such data 136, the quality of the inferences of the ML model(s) 118. …” (emphasis added)  Also see, e.g., FAULHABER, par [0068]:  “…the dashboard could be interactive and allow a user to change how traffic is passed to models, add models to groups, pull models out of groups, etc., and can also have alarming and alerting (e.g., to indicate that a model is not performing well).” (emphasis added)  EN:  FAULHABER teaches: alarming and alerting (e.g., to indicate that a model is not performing well), using data 136 that includes input data 134 and inference results 142 generated by the ML models 118.).
While FAULHABER (e.g., FIG. 1; par [0034]: “The data 136 may include, for example, the input data 134 (e.g., provided in, or identified by, a request 132), the individual inference results 142 generated by the ML models 118, etc. The analytics engine 122 can determine, using such data 136, the quality of the inferences of the ML model(s) 118. …”) teaches: data produced from the training of the machine learning model,

FAULHABER does not appear to explicitly teach: 
wherein the data [produced from the training of the machine learning model] comprises tensor-level numerical values; and
aggregate the data [produced from the training of the machine learning model];
perform an analysis of aggregated data [produced from the training of the machine learning model], wherein the analysis of the aggregated data comprises evaluation of one or more rules;

However, Sobol (US 2019/0209022 A1), in an analogous art of training machine learning models, teaches:
wherein the data [produced from the training of the machine learning model] comprises tensor-level numerical values (See, e.g., Sobol, FIG. 2F, par [0243]:  “Within the machine learning context,… terms related to the data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector".  An instance is an example or observation of the data being collected, and may be further defined with an attribute (or input attribute) that is a specific numerical value of that particular instance, while a label is the output, target or answer that the machine learning algorithm is attempting to solve, the feature is a numerical value that corresponds to an input or input variable in the form of the sensed parameters, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121 …” (emphasis added)  Also see, e.g., Sobol, FIG. 2F: MEMORY 173B, par [0249]: “Such a model, as well as its corresponding algorithmic tools, may include configuring processor 173A to execute programmed software instructions by using a predefined set of machine code 173E such that the neural network which may receive data from the various sensors 121 in the form of various nodes 2100A, 2100B, 2100C . . . 2100N of the input layer 2100, where each of these input nodes 2100A, 2100B, 2100C . . . 2100N can store its corresponding input data value within a particular location of memory 173B.”  EN:  Sobol teaches: Within the machine learning context, data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector", wherein an instance is an observation of the data being collected, and further defined with an attribute that is a specific numerical value of that particular instance, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121.); and

aggregate the data [produced from the training of the machine learning model] (See, e.g., Sobol, FIG. 2F, par [0164]:  “…the sensors 121 may act in conjunction with one another--as well as with instructions that are stored on a machine-readable medium such as memory 173B--to aggregate (or fuse) the acquired data in order to infer certain activities, conditions or circumstances. …”  EN:  Sobol teaches: aggregate (or fuse) the acquired data.).

perform an analysis of aggregated data [produced from the training of the machine learning model], wherein the analysis of the aggregated data comprises evaluation of one or more rules (See, e.g., Sobol, FIG. 6, par [0292]:  “…the system 1 may be configured to analyze the significance of the data either with intra-patient or inter-patient baseline data 1700, including for algorithmic training where a data set may be stored in local or remote memory 173B that contains classified or labeled examples with known instances of location, movement or other useful biometric measures of individual activity.  This training data set 16010 may be input into one or more of the machine learning algorithms discussed herein such that once the algorithm is optimized through validation and testing of respective data sets 1620, 1630, a suitable classification rule for use in the ensuing model is established.  This allows presently-acquired data unique to the individual being monitored to be input into the machine learning model in order to determine whether the current (that is to say, real-time) activity from the individual indicates whether the risk of a particular medical condition is heightened.”  EN:  Sobol teaches: the system 1 configured to analyze the significance of the data stored in local or remote memory 173B and input into one or more of the machine learning algorithms such that once the algorithm is optimized through validation and testing of respective data sets 1620, 1630, a suitable classification rule for use in the ensuing model is established.);

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to beneficially modify FAULHABER’s invention of machine learning model training by incorporating the teachings of Sobol, which teaches “wherein the data comprises tensor-level numerical values; and”, “aggregate the data;”, and “perform an analysis of aggregated data, wherein the analysis of the aggregated data comprises evaluation of one or more rules”.  A person having ordinary skill in the art would have been motivated toward such a combination to improve FAULHABER to supplant conventional data acquisition components and associated computer systems (see Sobol, par [0008]).  FAULHABER and Sobol are analogous art directed generally to training machine learning models.
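
EN (illustrative only):  To make concrete what the quoted limitations “aggregate the data” and “evaluation of one or more rules” describe, the following is a minimal sketch and not a characterization of FAULHABER, of Sobol, or of Applicant's disclosure. The record type TensorSample, the two rules, and the limit of 100.0 are hypothetical names and values chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class TensorSample:
    """Hypothetical record: tensor-level values collected on one device at one step."""
    device: str
    step: int
    name: str
    values: List[float]


def aggregate(samples: List[TensorSample]) -> Dict[str, List[float]]:
    """Pool the values of each named tensor across all devices and steps."""
    pooled: Dict[str, List[float]] = {}
    for sample in samples:
        pooled.setdefault(sample.name, []).extend(sample.values)
    return pooled


# A rule inspects one aggregated tensor and returns an alarm message or None.
Rule = Callable[[str, List[float]], Optional[str]]


def all_zero_rule(name: str, values: List[float]) -> Optional[str]:
    """Hypothetical rule: flag a tensor whose aggregated values are all zero."""
    if values and all(v == 0.0 for v in values):
        return f"{name}: all values are zero"
    return None


def large_magnitude_rule(limit: float) -> Rule:
    """Hypothetical rule factory: flag any value whose magnitude exceeds limit."""
    def rule(name: str, values: List[float]) -> Optional[str]:
        if values and max(abs(v) for v in values) > limit:
            return f"{name}: magnitude exceeds {limit}"
        return None
    return rule


def evaluate(pooled: Dict[str, List[float]], rules: List[Rule]) -> List[str]:
    """Evaluate every rule on every aggregated tensor and collect the alarms."""
    alarms: List[str] = []
    for name in sorted(pooled):
        for rule in rules:
            message = rule(name, pooled[name])
            if message is not None:
                alarms.append(message)
    return alarms
```

In this sketch, aggregation precedes analysis: values for the same tensor are first pooled across devices, and each rule is then evaluated on the pooled values to produce zero or more alarm messages.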


Regarding claim 2, FAULHABER and Sobol teach: 
(Currently amended) The system as recited in claim 1 (please see claim 1 rejection), 
wherein the one or more problems from the training of the machine learning model comprise a discrepancy in data distributions across a plurality of batches or across two or more of the computing devices of the machine learning training cluster (See, e.g., FAULHABER, par [0022]:  “…it can be useful to have several different ML models serving a same purpose.  For example, different ML models can be constructed using different training data, preprocessing operations, training parameters, model objectives, post-processing operations, or anything else that affects a final model.  However, deciding which model from multiple models is "better" (and therefore should be used) is not a straightforward task.  In many cases, a consistently "best" model may not exist, and a best model may depend on dynamic factors, such as spiky traffic and/or data distribution drifts.  …”  EN:  FAULHABER teaches: different ML models can be constructed using different training data, preprocessing operations, training parameters, model objectives, post-processing operations, or anything else that affects a final model, such as spiky traffic and/or data distribution drifts.).


Regarding claim 3, FAULHABER and Sobol teach: 
(Currently amended) The system as recited in claim 1 (please see claim 1 rejection), 

FAULHABER does not appear to explicitly teach: 
wherein the one or more problems from the training of the machine learning model comprise a vanishing gradient or an exploding gradient detected in tensor-level data.

However, Sobol (US 2019/0209022 A1), in the analogous art of training machine learning models, further teaches:

wherein the one or more problems from the training of the machine learning model comprise a vanishing gradient or an exploding gradient detected in tensor-level data (See, e.g., Sobol, FIG. 2F, par [0243]:  “Within the machine learning context,… terms related to the data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector".  An instance is an example or observation of the data being collected, and may be further defined with an attribute (or input attribute) that is a specific numerical value of that particular instance, while a label is the output, target or answer that the machine learning algorithm is attempting to solve, the feature is a numerical value that corresponds to an input or input variable in the form of the sensed parameters, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121 or other data-gathering components of the wearable electronic device 100.. …”(emphasis added) Also see, e.g., Sobol, par [0353]: “While the general trajectory for all of these conditions is a downward worsening in functional status that ultimately ends in death, it is the presence of various acute, crisis or unstable stages along the trajectory that are of most interest to the present disclosure and the LEAP data being acquired by the wearable electronic device 100 for analysis on it or the system 1.”  EN:  Sobol teaches: a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121, where the general trajectory for all of these conditions is a downward worsening in functional status.).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to beneficially modify the FAULHABER and Sobol combination for machine learning model training by incorporating the additional teachings of Sobol, which teaches “wherein the one or more problems from the training of the machine learning model comprise a vanishing gradient or an exploding gradient detected in tensor-level data.”  A person having ordinary skill in the art would have been motivated toward such a combination to improve the FAULHABER and Sobol combination because certain acute events may signal transitions between subsequent ones of the identifiable phases PH along the downward trajectory (see Sobol, par [0353]).  FAULHABER and Sobol are analogous art directed generally to training machine learning models.
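
EN (illustrative only):  For reference, the claim 3 limitation “a vanishing gradient or an exploding gradient detected in tensor-level data” can be understood through the minimal sketch below, which is not drawn from FAULHABER, Sobol, or Applicant's disclosure. It assumes per-layer gradient values are available as plain lists of numbers; the function names and the thresholds 1e-7 and 1e3 are hypothetical choices made only for this example.

```python
import math
from typing import Dict, List


def l2_norm(values: List[float]) -> float:
    """Euclidean norm of one layer's gradient values."""
    return math.sqrt(sum(v * v for v in values))


def classify_gradients(
    grads: Dict[str, List[float]],
    vanish_below: float = 1e-7,   # hypothetical threshold
    explode_above: float = 1e3,   # hypothetical threshold
) -> Dict[str, str]:
    """Label each layer whose gradient norm looks vanishing or exploding.

    Layers with an unremarkable norm are omitted from the result.
    """
    findings: Dict[str, str] = {}
    for layer, values in grads.items():
        # NaN or infinite entries are treated here as a sign of explosion.
        if any(not math.isfinite(v) for v in values):
            findings[layer] = "exploding"
            continue
        norm = l2_norm(values)
        if norm < vanish_below:
            findings[layer] = "vanishing"
        elif norm > explode_above:
            findings[layer] = "exploding"
    return findings
```

The sketch compares a per-layer gradient norm against a lower and an upper bound; other criteria (for example, trends in the norm across training steps) could equally be used.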

Regarding claim 4, FAULHABER and Sobol teach: 
(Currently amended) The system as recited in claim 1 (please see claim 1 rejection),
wherein the training of the machine learning model is discontinued based at least in part on the one or more problems from the training of the machine learning model (See, e.g., FAULHABER, FIG. 8; par [0081]:  “…The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829, thereby relieving the user from the burden of having to worry about over-utilization (e.g., acquiring too little computing resources and suffering performance issues) or under-utilization (e.g., acquiring more computing resources than necessary to train the machine learning models, and thus overpaying).”  EN:  FAULHABER teaches: The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829.). 


Regarding claim 5, FAULHABER teaches: 
(Currently amended) A computer-implemented method (See, e.g., FAULHABER, FIG. 7; par [0069]:  “FIG. 7 is a flow diagram illustrating operations 1000 of a method for dynamic accuracy-based deployment of machine learning models according to some embodiments. …”  EN:  FAULHABER discloses: FIG. 7 is a flow diagram illustrating operations 1000 of a method for dynamic accuracy-based deployment of machine learning models.), comprising:
detecting, by the machine learning analysis system, one or more conditions from the training of the machine learning model based at least in part on the analysis (See, e.g., FAULHABER, FIG. 1; par [0034]:  “The data 136 may include, for example, the input data 134 (e.g., provided in, or identified by, a request 132), the individual inference results 142 generated by the ML models 118, etc. The analytics engine 122 can determine, using such data 136, the quality of the inferences of the ML model(s) 118. …”  Also see, e.g., FAULHABER, par [0068]: “…the dashboard could be interactive and allow a user to change how traffic is passed to models, add models to groups, pull models out of groups, etc., and can also have alarming and alerting (e.g., to indicate that a model is not performing well). …”  EN:  FAULHABER teaches: determine the quality of the inferences of the ML model(s) that may indicate that a model is not performing well.); and 
generating, by the machine learning analysis system, one or more alarms describing the one or more conditions from the training of the machine learning model (See, e.g., FAULHABER, par [0068]:  “…the dashboard could be interactive and allow a user to change how traffic is passed to models, add models to groups, pull models out of groups, etc., and can also have alarming and alerting (e.g., to indicate that a model is not performing well).” EN:  FAULHABER teaches: alarming and alerting (e.g., to indicate that a model is not performing well).).

While FAULHABER (e.g., FIG. 1; par [0034]: “The data 136 may include, for example, the input data 134 (e.g., provided in, or identified by, a request 132), the individual inference results 142 generated by the ML models 118, etc. The analytics engine 122 can determine, using such data 136, the quality of the inferences of the ML model(s) 118. …”) teaches: data produced from a training of the machine learning model,

FAULHABER does not appear to explicitly teach: 
receiving, by a machine learning analysis system, data [produced from a training of a machine learning model], wherein the data [produced from the training of the machine learning model] is collected by one or more computing devices of a machine learning training cluster;
performing, by the machine learning analysis system, an analysis of the data [produced from the training of the machine learning model];

However, Sobol (US 2019/0209022 A1), in an analogous art of training machine learning models, teaches:
receiving, by a machine learning analysis system, data [produced from a training of a machine learning model], wherein the data [produced from the training of the machine learning model] is collected by one or more computing devices of a machine learning training cluster (See, e.g., Sobol, FIG. 2F, par [0243]:  “Within the machine learning context,… terms related to the data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector".  An instance is an example or observation of the data being collected, and may be further defined with an attribute (or input attribute) that is a specific numerical value of that particular instance, while a label is the output, target or answer that the machine learning algorithm is attempting to solve, the feature is a numerical value that corresponds to an input or input variable in the form of the sensed parameters, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121 or other data-gathering components of the wearable electronic device 100. …”  Also see, e.g., Sobol, FIG. 2F: MEMORY 173B, par [0249]: “Such a model, as well as its corresponding algorithmic tools, may include configuring processor 173A to execute programmed software instructions by using a predefined set of machine code 173E such that the neural network which may receive data from the various sensors 121 in the form of various nodes 2100A, 2100B, 2100C . . . 2100N of the input layer 2100, where each of these input nodes 2100A, 2100B, 2100C . . . 2100N can store its corresponding input data value within a particular location of memory 173B.”  EN:  Sobol teaches: Within the machine learning context, data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector", wherein an instance is an observation of the data being collected, and further defined with an attribute that is a specific numerical value of that particular instance, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121 or other data-gathering components of the wearable electronic device 100, and stored within a particular location of memory 173B by using a predefined set of machine code 173E.); and

performing, by the machine learning analysis system, an analysis of the data [produced from the training of the machine learning model] (See, e.g., Sobol, FIG. 6, par [0292]:  “…the system 1 may be configured to analyze the significance of the data either with intra-patient or inter-patient baseline data 1700, including for algorithmic training where a data set may be stored in local or remote memory 173B that contains classified or labeled examples with known instances of location, movement or other useful biometric measures of individual activity.  This training data set 16010 may be input into one or more of the machine learning algorithms discussed herein such that once the algorithm is optimized through validation and testing of respective data sets 1620, 1630, a suitable classification rule for use in the ensuing model is established.  This allows presently-acquired data unique to the individual being monitored to be input into the machine learning model in order to determine whether the current (that is to say, real-time) activity from the individual indicates whether the risk of a particular medical condition is heightened.”  EN:  Sobol teaches: the system 1 configured to analyze the significance of the data stored in local or remote memory 173B and input into one or more of the machine learning algorithms such that once the algorithm is optimized through validation and testing of respective data sets 1620, 1630, a suitable classification rule for use in the ensuing model is established.);

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to beneficially modify FAULHABER’s invention of machine learning model training by incorporating the teachings of Sobol, which teaches “receiving, by a machine learning analysis system, data, wherein the data is collected by one or more computing devices of a machine learning training cluster;”, and “performing, by the machine learning analysis system, an analysis of the data;”.  A person having ordinary skill in the art would have been motivated toward such a combination to improve FAULHABER to supplant conventional data acquisition components and associated computer systems (see Sobol, par [0008]).  FAULHABER and Sobol are analogous art directed generally to training machine learning models.


Regarding claim 6, FAULHABER and Sobol teach: 
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection), 

While FAULHABER (e.g., FIG. 1; par [0034]: “The data 136 may include, for example, the input data 134 (e.g., provided in, or identified by, a request 132), the individual inference results 142 generated by the ML models 118, etc. The analytics engine 122 can determine, using such data 136, the quality of the inferences of the ML model(s) 118. …”) teaches: data produced from the training of the machine learning model, and FAULHABER (e.g., FIG. 8, par [0077]:  “…a user device 802 can provide a training request to the frontend 829 that includes a container image (or multiple container images, or an identifier of one or multiple locations where container images are stored), an indicator of input data (e.g., an address or location of input data), one or more hyperparameter values (e.g., values indicating how the algorithm will operate, how many algorithms to run in parallel, how many clusters into which to separate data, etc.), and/or information describing the computing machine on which to train a machine learning model (e.g., a graphical processing unit (GPU) instance type, a central processing unit (CPU) instance type, an amount of memory to allocate, a type of virtual machine instance to use for training, etc.).”) teaches: data output from one or more graphics processing units (GPUs) of the machine learning training cluster,

FAULHABER does not appear to explicitly teach: 
wherein the data [produced from the training of the machine learning model] comprises tensor data [output from one or more graphics processing units (GPUs) of the machine learning training cluster], and wherein the data [produced from the training of the machine learning model] is aggregated prior to the analysis of the data associated with the training of the machine learning model.

However, Sobol (US 2019/0209022 A1), in the analogous art of training machine learning models, further teaches:

wherein the data [produced from the training of the machine learning model] comprises tensor data [output from one or more graphics processing units (GPUs) of the machine learning training cluster] (See, e.g., Sobol, FIG. 2F, par [0243]:  “Within the machine learning context,… terms related to the data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector".  An instance is an example or observation of the data being collected, and may be further defined with an attribute (or input attribute) that is a specific numerical value of that particular instance, while a label is the output, target or answer that the machine learning algorithm is attempting to solve, the feature is a numerical value that corresponds to an input or input variable in the form of the sensed parameters, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121 or other data-gathering components of the wearable electronic device 100.. …”(emphasis added)  EN:  Sobol teaches: a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features.), and wherein the data [produced from the training of the machine learning model] is aggregated prior to the analysis of the data associated with the training of the machine learning model (See, e.g., Sobol, FIG. 2F, par [0164]:  “…the sensors 121 may act in conjunction with one another--as well as with instructions that are stored on a machine-readable medium such as memory 173B--to aggregate (or fuse) the acquired data in order to infer certain activities, conditions or circumstances. …” (emphasis added) EN:  Sobol teaches: aggregate (or fuse) the acquired data.).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to beneficially modify the FAULHABER and Sobol combination for machine learning model training by incorporating the additional teachings of Sobol, which teaches “wherein the data comprises tensor data, and wherein the data is aggregated prior to the analysis of the data associated with the training of the machine learning model.”  A person having ordinary skill in the art would have been motivated toward such a combination to improve FAULHABER to supplant conventional data acquisition components and associated computer systems (see Sobol, par [0008]).  FAULHABER and Sobol are analogous art directed generally to training machine learning models.

Regarding claim 7, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection),
wherein the one or more conditions from the training of the machine learning model comprise a discrepancy in data distributions across a plurality of batches (See, e.g., FAULHABER, par [0022]:  “…it can be useful to have several different ML models serving a same purpose.  For example, different ML models can be constructed using different training data, preprocessing operations, training parameters, model objectives, post-processing operations, or anything else that affects a final model.  However, deciding which model from multiple models is "better" (and therefore should be used) is not a straightforward task.  In many cases, a consistently "best" model may not exist, and a best model may depend on dynamic factors, such as spiky traffic and/or data distribution drifts.  …” (emphasis added)   EN:  FAULHABER teaches: different ML models can be constructed using different training data, preprocessing operations, training parameters, model objectives, post-processing operations, or anything else that affects a final model, such as spiky traffic and/or data distribution drifts.).


Regarding claim 8, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection),
wherein the one or more conditions from the training of the machine learning model comprise a discrepancy in data distributions across two or more of the computing devices of the machine learning training cluster (See, e.g., FAULHABER, par [0022]:  “…it can be useful to have several different ML models serving a same purpose.  For example, different ML models can be constructed using different training data, preprocessing operations, training parameters, model objectives, post-processing operations, or anything else that affects a final model.  However, deciding which model from multiple models is "better" (and therefore should be used) is not a straightforward task.  In many cases, a consistently "best" model may not exist, and a best model may depend on dynamic factors, such as spiky traffic and/or data distribution drifts.  …” (emphasis added)  EN:  FAULHABER teaches: different ML models can be constructed using different training data, preprocessing operations, training parameters, model objectives, post-processing operations, or anything else that affects a final model, such as spiky traffic and/or data distribution drifts.).


Regarding claim 9, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection),

FAULHABER does not appear to explicitly teach: 
wherein the one or more conditions from the training of the machine learning model comprise a vanishing gradient or an exploding gradient detected in tensor-level data.

However, Sobol (US 2019/0209022 A1), in the analogous art of training machine learning models, further teaches:

wherein the one or more conditions from the training of the machine learning model comprise a vanishing gradient or an exploding gradient detected in tensor-level data (See, e.g., Sobol, FIG. 2F, par [0243]:  “Within the machine learning context,… terms related to the data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector".  An instance is an example or observation of the data being collected, and may be further defined with an attribute (or input attribute) that is a specific numerical value of that particular instance, while a label is the output, target or answer that the machine learning algorithm is attempting to solve, the feature is a numerical value that corresponds to an input or input variable in the form of the sensed parameters, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121 or other data-gathering components of the wearable electronic device 100.. …”(emphasis added) Also see, e.g., Sobol, par [0353]: “While the general trajectory for all of these conditions is a downward worsening in functional status that ultimately ends in death, it is the presence of various acute, crisis or unstable stages along the trajectory that are of most interest to the present disclosure and the LEAP data being acquired by the wearable electronic device 100 for analysis on it or the system 1.”  EN:  Sobol teaches: a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121, where the general trajectory for all of these conditions is a downward worsening in functional status.).


It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to beneficially modify the FAULHABER and Sobol combination for machine learning model training by incorporating the additional teachings of Sobol, which teaches “wherein the one or more conditions from the training of the machine learning model comprise a vanishing gradient or an exploding gradient detected in tensor-level data.”  A person having ordinary skill in the art would have been motivated toward such a combination to improve FAULHABER and Sobol because certain acute events may signal transitions between subsequent ones of the identifiable phases PH along the downward trajectory (see Sobol, par [0353]).  FAULHABER and Sobol are analogous art directed generally to training machine learning models.



Regarding claim 10, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection),
FAULHABER does not appear to explicitly teach: 
wherein the one or more conditions from the training of the machine learning model comprise an overflow or an underflow detected in tensor-level data.

However, Sobol (US 2019/0209022 A1), in the analogous art of training machine learning models, further teaches:

wherein the one or more conditions from the training of the machine learning model comprise an overflow or an underflow detected in tensor-level data (See, e.g., Sobol, FIG. 2F, par [0243]:  “Within the machine learning context,… terms related to the data being acquired, analyzed and reported include "instance", "label", "feature" and "feature vector".  An instance is an example or observation of the data being collected, and may be further defined with an attribute (or input attribute) that is a specific numerical value of that particular instance, while a label is the output, target or answer that the machine learning algorithm is attempting to solve, the feature is a numerical value that corresponds to an input or input variable in the form of the sensed parameters, whereas a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121 or other data-gathering components of the wearable electronic device 100.. …”(emphasis added) Also see, e.g., Sobol, par [0353]: “While the general trajectory for all of these conditions is a downward worsening in functional status that ultimately ends in death, it is the presence of various acute, crisis or unstable stages along the trajectory that are of most interest to the present disclosure and the LEAP data being acquired by the wearable electronic device 100 for analysis on it or the system 1.”  EN:  Sobol teaches: a feature vector is a multidimensional representation (that is to say, vector, array or tensor) of the various features that are used to represent the object, phenomenon or thing that is being measured by the sensors 121, where the general trajectory for all of these conditions is a downward worsening in functional status.).


It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to beneficially modify the FAULHABER and Sobol combination for machine learning model training by incorporating the additional teachings of Sobol, which teaches “wherein the one or more conditions from the training of the machine learning model comprise an overflow or an underflow detected in tensor-level data.”  A person having ordinary skill in the art would have been motivated toward such a combination to improve FAULHABER and Sobol because certain acute events may signal transitions between subsequent ones of the identifiable phases PH along the downward trajectory (see Sobol, par [0353]).  FAULHABER and Sobol are analogous art directed generally to training machine learning models.


Regarding claim 11, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection),
wherein the data produced from the training of the machine learning model comprises data describing resource utilization of the machine learning training cluster, and wherein the one or more conditions from the training of the machine learning model represent a violation of one or more resource utilization thresholds (See, e.g., FAULHABER, FIG. 8, par [0096]:  “…the model metrics can indicate that the machine learning model is performing poorly (e.g., has an error rate above a threshold value, has a statistical distribution that is not an expected or desired distribution (e.g., not a binomial distribution, a Poisson distribution, a geometric distribution, a normal distribution, Gaussian distribution, etc.), has an execution latency above a threshold value, has a confidence level below a threshold value)) and/or is performing progressively worse (e.g., the quality metric continues to worsen over time). …”  EN:  FAULHABER teaches: the model metrics can indicate that the machine learning model is performing poorly, e.g., has an error rate above a threshold value.).

Regarding claim 12, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection),
wherein the data produced from the training of the machine learning model comprises data describing resource utilization of the machine learning training cluster, and wherein the one or more conditions from the training of the machine learning model represent one or more performance bottlenecks identified in one or more resources of the machine learning training cluster (See, e.g., FAULHABER, FIG. 8, par [0096]:  “…the model metrics can indicate that the machine learning model is performing poorly (e.g., has an error rate above a threshold value, has a statistical distribution that is not an expected or desired distribution (e.g., not a binomial distribution, a Poisson distribution, a geometric distribution, a normal distribution, Gaussian distribution, etc.), has an execution latency above a threshold value, has a confidence level below a threshold value)) and/or is performing progressively worse (e.g., the quality metric continues to worsen over time). …”  EN:  FAULHABER teaches: the model metrics can indicate that the machine learning model is performing poorly (e.g., has an error rate above a threshold value, has a statistical distribution that is not an expected or desired distribution (e.g., not a binomial distribution, a Poisson distribution, a geometric distribution, a normal distribution, Gaussian distribution, etc.), has an execution latency above a threshold value, has a confidence level below a threshold value)).).

Regarding claim 13, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection), further comprising: 

discontinuing the training of the machine learning model based at least in part on the one or more conditions from the training of the machine learning model (See, e.g., FAULHABER, FIG. 8; par [0081]:  “…The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829, thereby relieving the user from the burden of having to worry about over-utilization (e.g., acquiring too little computing resources and suffering performance issues) or under-utilization (e.g., acquiring more computing resources than necessary to train the machine learning models, and thus overpaying).”  EN:  FAULHABER teaches: The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829.); 

modifying a configuration of the training of the machine learning model (See, e.g., FAULHABER, FIG. 8; par [0096]:  “…transmit a request to the model training system 820 to modify the machine learning model being trained (e.g., transmit a modification request).. …”  EN:  FAULHABER teaches: modify the machine learning model being trained.); and 

restarting the training of the machine learning model according to the configuration (See, e.g., FAULHABER, FIG. 8; par [0096]:  “…execute the code 836 stored in the new ML training container 830 to restart the machine learning model training process. …”  EN:  FAULHABER teaches: execute the code 836 stored in the new ML training container 830 to restart the machine learning model training process.).

Regarding claim 14, FAULHABER and Sobol teach:
(Currently amended) The method as recited in claim 5 (please see claim 5 rejection), further comprising: 
discontinuing use of the machine learning training cluster based at least in part on the one or more conditions from the training of the machine learning model (See, e.g., FAULHABER, FIG. 8; par [0081]:  “…The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829, thereby relieving the user from the burden of having to worry about over-utilization (e.g., acquiring too little computing resources and suffering performance issues) or under-utilization (e.g., acquiring more computing resources than necessary to train the machine learning models, and thus overpaying).”  EN:  FAULHABER teaches: The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829.).

Claims 15-19:
	Media Claims 15-19 are similar to rejected method Claims 5, 7, 8, 9 and 11, respectively.  
As such, Claims 15-19 are rejected under AIA  35 U.S.C. 103 as being unpatentable over the combination of FAULHABER and Sobol under a similar rationale.

Regarding claim 20, FAULHABER and Sobol teach:
(Currently amended) The one or more non-transitory computer-readable storage media as recited in claim 15 (please see claim 15 rejection), further comprising additional program instructions that, when executed on or across the one or more processors, perform: 
discontinuing the training of the machine learning model based at least in part on the one or more problems from the training of the machine learning model (See, e.g., FAULHABER, FIG. 8; par [0081]:  “…The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829, thereby relieving the user from the burden of having to worry about over-utilization (e.g., acquiring too little computing resources and suffering performance issues) or under-utilization (e.g., acquiring more computing resources than necessary to train the machine learning models, and thus overpaying).”  EN:  FAULHABER discloses: The model training system 820 can automatically scale up and down based on the volume of training requests received from user devices 802 via frontend 829.).

			Response to Arguments/Remarks
7.	Applicant’s Arguments/Remarks, filed on 08/12/2022, have been fully considered by the Examiner, but they are not persuasive to overcome the prior-art rejections, as they are either ineffective or moot in view of the new grounds of rejection used in this office action, as necessitated by Applicant’s amendments.


Conclusion
8.	Claims 1-20 are rejected.
THIS ACTION IS NON-FINAL. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED HUDA, whose telephone number is (571)270-7171. The examiner can normally be reached Monday - Friday, 9 AM - 5:30 PM Eastern Time. The fax number and the email address for the examiner are (571)270-8171 and Mohammed.Huda@USPTO.GOV. Please note that an applicant can send email messages to the examiner, but the examiner cannot send email messages to the applicant without written authorization from the applicant. An applicant can authorize the examiner for email communication by including the following in an email: “According to MPEP 502.03, recognizing that Internet communications are not secure, I hereby authorize the examiner to communicate with me concerning any subject matter of this application by electronic mail. I understand that a copy of these communications will be made of record in the application file.”
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen, can be reached at (571)272-3708. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/MOHAMMED  HUDA/					September 20, 2022
Examiner, Art Unit 2191	


/WEI Y ZHEN/Supervisory Patent Examiner, Art Unit 2191