DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This action is in response to the submission filed 9 March 2022 for application 16/142,812. Claims 1, 9, and 17 have been amended. Currently claims 1-20 are pending and have been examined.
Applicant’s arguments with respect to the §101 rejection of claims 1-20 have been fully considered and are persuasive. The §101 rejection of claims 1-20 has been withdrawn in view of the arguments and amendments made.

	
Response to Arguments
Applicant’s arguments, see Pages 9-11 of remarks, filed 09 March 2022, with respect to the rejection of claims 1-20 under 35 USC § 103 have been fully considered but are not persuasive. Applicant argues that the cited references Coffman, Toyama, Le, Philips, and Meek, alone or in combination, fail to teach or fairly suggest at least the quoted features (see page 10 of remarks) of independent claim 1. Examiner respectfully disagrees because the combination of Coffman, Toyama, and Le teaches every element of the amended independent claim 1 (and similarly independent claims 9 and 17) as shown in detail below.
Specifically, applicant argues on page 10 of remarks, that the cited reference combination fails to teach, “receiving process data from a plurality of equipment of a manufacturing process, the process data comprising time series sensor data of a state variable of the plurality of equipment; using a processor to perform discretization modeling of a continuous probability distribution based on the time series sensor data to yield a prediction of a future probability distribution for the state variable”, and “modifying an operating condition of at least one of the plurality of equipment of the manufacturing process during a control phase using the process control system, wherein the process control system modifies the operating condition based on the input of the predicted probability density function for the forecast horizon”. Examiner respectfully disagrees because the combination of Coffman and Toyama teaches the above quoted limitations.
Column 8, Lines 64-66 of Coffman states, FIG. 3 is a flowchart illustrating a method to generate a set of predictions associated with manufacture processes of a physical object, which under the broadest reasonable interpretation, examiner is interpreting as, receiving process data of a manufacturing process, noting to also see Fig. 1. Column 1, Lines 18-19 of Toyama states, data collected from a sensor, which under the broadest reasonable interpretation, examiner is interpreting as, from a plurality of equipment. Column 7, Lines 15-18 of Toyama states, in addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention and Column 1, Lines 17-19, of Toyama states, in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor, which under the broadest reasonable interpretation, examiner is interpreting as, the process data comprising time series sensor data of a state variable of the plurality of equipment. Hence, the combination of Coffman and Toyama teaches “receiving process data from a plurality of equipment of a manufacturing process, the process data comprising time series sensor data of a state variable of the plurality of equipment”.
Column 5, Lines 48-52 of Coffman states, from these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process and Column 3, Lines 42-45 states, such a Predictive System for Manufacture Processes (hereinafter PSMP) server 109 can provide predictions or estimations in near real-time regarding manufacturing processes and Column 17, Lines 43-45 states, the implementations described below are discussed in the context of log normal distributions, which under the broadest reasonable interpretation, examiner is interpreting as, using a processor to perform discretization modeling of a continuous probability distribution to yield a prediction of a future probability distribution for the state variable, noting that log normal distributions correspond to continuous probability distribution. Column 7, Lines 15-18 of Toyama states, in addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention and Column 1, Lines 17-19 states, in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor, which under the broadest reasonable interpretation, examiner is interpreting as, based on the time series sensor data. Hence, the combination of Coffman and Toyama teaches “using a processor to perform discretization modeling of a continuous probability distribution based on the time series sensor data to yield a prediction of a future probability distribution for the state variable”.
Column 17, Lines 58-65 of Toyama states, tracking of visual data is accomplished by use of a Bayesian network that is trained and structured offline by use of dynamic sensor data for determining object position in conjunction with position estimates provided by each modality. Thus, the trained and structured Bayesian modality fusion of the present invention accomplishes visual tracking by adapting its estimates by detecting changes in indicators, which under the broadest reasonable interpretation, examiner is interpreting as, and modifying an operating condition of at least one of the plurality of equipment. Column 18, Lines 10-20 of Coffman states, the requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. In some implementations, the set of parameters is sent to multi-objective optimizer 217 such that a set of axioms and/or attributes associated with the requested manufacturing process can be optimized according to one or more competing objectives and/or conditions, which under the broadest reasonable interpretation, examiner is interpreting as, of the manufacturing process during a control phase using the process control system, wherein the process control system modifies the operating condition based on the input of the predicted probability density function for the forecast horizon. Hence, the combination of Coffman and Toyama teaches “modifying an operating condition of at least one of the plurality of equipment of the manufacturing process during a control phase using the process control system, wherein the process control system modifies the operating condition based on the input of the predicted probability density function for the forecast horizon”.
Furthermore, applicant argues on pages 10 and 11 that Toyama appears silent with respect to making any assumptions on the parametric form of the underlying probability distribution of the state variable. As shown in the detailed rejection below, Column 7, Paragraph 3 of Toyama states, for each sensor modality 214, nodes 212, 218 and 220 are variables that are instantiated by the sensor modality 214 and nodes 210 and 216 represent inferred values. In particular, node 210 is a target ground truth node that represents an unknown state of the target object, which under the broadest reasonable interpretation, examiner is interpreting as, without making any assumptions on an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable, and Column 20, Line 63 of Coffman states, parametric probability distribution, which under the broadest reasonable interpretation, examiner is interpreting as, a parametric form of. Hence, the combination of Toyama and Coffman teaches “without making any assumptions on a parametric form of an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable”.
Lastly, since the dependent claims depend either directly or indirectly from applicant’s amended independent claims 1, 9, and 17, they inherit the same rejection as shown in detail below.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 2, 4-7, 9, 10, 12-15, 17, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Coffman et al. (US 10061300 B1) in view of Toyama (US 6502082 B1) and Le et al. (Discretized Continuous Speech Emotion Recognition with Multi-Task Deep Recurrent Neural Network, 2017).

Regarding claim 1
Coffman teaches: A computer-implemented method comprising: receiving process data of a manufacturing process ([Column 8, Lines 64-66] FIG. 3 is a flowchart illustrating a method to generate a set of predictions associated with manufacture processes of a physical object. Note: Also see Fig. 1)
using a processor to perform discretization modeling of a continuous probability distribution to yield a prediction of a future probability distribution for a state variable ([Column 5, Lines 48-52] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process. [Column 3, Lines 42-45] Such a Predictive System for Manufacture Processes (hereinafter PSMP) server 109 can provide predictions or estimations in near real-time regarding manufacturing processes. [Column 17, Lines 43-45] The implementations described below are discussed in the context of log normal distributions. Note: log normal distributions correspond to continuous probability distribution) a parametric form of ([Column 20, Line 63] parametric probability distribution);
using the processor on the prediction of the future probability distribution ([Column 5, Lines 48-52] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process); 
using the processor to perform a multi-step forecast of the prediction of the future probability distribution to create a predicted probability density function for a forecast horizon ([Column 16, Lines 64-67] In some instances, machine learning models selected to build the predictive engine at 611, are further evaluated using an unseen test dataset 609. Thus, the predictive engine built at 611 generates classification values and/or predicted [Column 17, Lines 1-8] values at 613. Classification and/or prediction values are evaluated at 615 to determine whether such values have achieved a desired accuracy level. When such a desired accuracy level is reached, the training phase ends; when the desired accuracy level is not reached, however, then a subsequent iteration of the process shown in FIG. 6 is performed starting at 601 with variations such as, for example, considering a larger collection of raw data. Note: Subsequent iteration corresponds to multi-step);
using the predicted probability density function for the forecast horizon as an input to a process control system ([Column 8, Lines 12-17] Predictive engine 215 includes a set of trained machine-learning models and other suitable computation models to infer axioms regarding a physical object represented in a digital model and likelihood or probabilities associated with entities of a supply chain predicted for a manufacturing process request. [Column 18, Lines 10-17] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. In some implementations, the set of parameters is sent to multi-objective optimizer 217. Note: Multi-objective optimizer corresponds to the process control system. Also see Fig. 7);
of the manufacturing process during a control phase using the process control system, wherein the process control system modifies the operating condition based on the input of the predicted probability density function for the forecast horizon ([Column 18, Lines 10-20] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. In some implementations, the set of parameters is sent to multi-objective optimizer 217 such that a set of axioms and/or attributes associated with the requested manufacturing process can be optimized according to one or more competing objectives and/or conditions. Note: manufacturing process can be optimized corresponds to modifying the manufacturing process).
However, Coffman does not explicitly disclose: from a plurality of equipment, the process data comprising time series sensor data of a state variable of the plurality of equipment; based on the time series sensor data, without making any assumptions on an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable, wherein sensor observations of the state variable comprise a stochastic process; to impose a smoothness condition; and modifying an operating condition of at least one of the plurality of equipment.
Toyama teaches, in an analogous system: from a plurality of equipment ([Column 1, Lines 18-19] data collected from a sensor) the process data comprising time series sensor data of a state variable of the plurality of equipment ([Column 7, Lines 15-18] In addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention. [Column 1, Lines 17-19] in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor);
based on the time series sensor data ([Column 7, Lines 15-18] In addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention. [Column 1, Lines 17-19] in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor) without making any assumptions on an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable ([Column 7, Paragraph 3] For each sensor modality 214, nodes 212, 218 and 220 are variables that are instantiated by the sensor modality 214 and nodes 210 and 216 represent inferred values. In particular, node 210 is a target ground truth node that represents an unknown state of the target object) wherein sensor observations of the state variable comprise a stochastic process ([Column 7, Paragraph 4] From a Bayesian perspective, the ground-truth state influences or causes an output from the sensor modality 214 (it should be noted that the use of term "causes" comprises both deterministic and stochastic components));
and modifying an operating condition of at least one of the plurality of equipment ([Column 17, Lines 58-65] Tracking of visual data is accomplished by use of a Bayesian network that is trained and structured offline by use of dynamic sensor data for determining object position in conjunction with position estimates provided by each modality. Thus, the trained and structured Bayesian modality fusion of the present invention accomplishes visual tracking by adapting its estimates by detecting changes in indicators).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Coffman to incorporate the teachings of Toyama to use process data comprising time series sensor data of a state variable of the plurality of equipment, to base the modeling on the time series sensor data, to use an unobservable ground truth and sensor observations comprising a stochastic process, and to modify an operating condition of at least one of the plurality of equipment. One would have been motivated to make this modification because doing so would give the benefit of building probabilistic submodels to dynamically diagnose reliability as taught by Toyama [Column 7, Paragraph 5].
Le teaches, in an analogous system: to impose a smoothness condition ([Page 1111, section 5, Column 2, Paragraph 3] a smoothing effect on the time series).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to use a smoothness condition. One would have been motivated to make this modification because doing so would give the benefit of being completely data-driven as taught by Le [Page 1111, section 5, Column 2, Paragraph 3].

Regarding claim 2
Coffman teaches: The computer-implemented method of claim 1, wherein discretization modeling of a continuous probability distribution function further comprises using a processor to receive a series of target variables (y), auxiliary observations (x), and control sequences (u) ([27] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process. [148] where p.sub.i is the proportion of observations or samples with a target variable (e.g., SID) and m is the number of different values taken by the target variable. [151] Additionally, certain of the steps may be performed concurrently in a parallel process when possible, as well as performed sequentially as described above. [99] The implementations described below are discussed in the context of log normal distributions. Note: log normal distributions correspond to continuous probability distribution).

Regarding claim 4
The combination of Coffman, Toyama, and Le teaches: The computer-implemented method of claim 1 (as shown above).
However, Coffman does not explicitly disclose: wherein imposing a smoothness condition on the prediction of the future probability distribution comprises using an artificial neural network with softmax function and a regularized cross-entropy loss.
Le teaches, in an analogous system: wherein imposing a smoothness condition on the prediction of the future probability distribution comprises using an artificial neural network with softmax function and a regularized cross-entropy loss ([Page 1111, section 5] In both cases, emotion decoding has a smoothing effect on the time series. [Page 1108, section 1, column 2] We train a multi-task deep bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) with cost-sensitive Cross Entropy (CE) loss to jointly predict label sequences at different resolutions. [Page 1109, section 2.3] For classification tasks, softmax normalization is applied to the output vector. [Page 1108, section 1] The objective of the challenge was to make temporal predictions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to impose a smoothness condition on the predicted probability distribution using an artificial neural network with softmax function and a regularized cross-entropy loss. One would have been motivated to make this modification because doing so would give the benefit of jointly predicting label sequences at different resolutions as taught by Le [Page 1108, Section 1, Column 2].

Regarding claim 5
Coffman teaches: The computer-implemented method of claim 4, wherein the artificial neural network is initially trained ([114] In some implementations, the artificial neural network can be trained).

Regarding claim 6
Coffman teaches: The computer-implemented method of claim 5, wherein the artificial neural network is trained ([114] In some implementations, the artificial neural network can be trained).
However, Coffman does not explicitly disclose: by minimizing regularized cross-entropy loss.
Le teaches, in an analogous system: by minimizing regularized cross-entropy loss ([Page 1110, section 4.2] We train the network to minimize the total cost-sensitive CE (CCE) loss. Note: CE is the acronym for Cross Entropy [Page 1108, section 1, column 2]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to train the network by minimizing a regularized cross-entropy loss. One would have been motivated to make this modification because doing so would give the benefit of jointly predicting label sequences at different resolutions as taught by Le [Page 1108, Section 1, Column 2].

Regarding claim 7
Coffman teaches: The computer-implemented method of claim 1, further comprising using a recurrent neural network for prediction of the future probability distribution ([100] In some implementations, regression machine learning models 703 and 705 can include, for example, one or more deep learning models, one or more machine learning clustering models, one or more instance-based machine learning models, one or more kernel-based machine learning models, and/or any combination thereof. Examples of such deep learning models include deep Boltzmann machines. [102] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. Note: Boltzmann machine corresponds to a recurrent neural network).

Regarding claim 9
Coffman teaches: A system comprising: a memory; a processor coupled to the memory, the processor operable to execute instructions stored in the memory, the instructions causing the processor to ([Column 2, Lines 30-33] At least one embodiment described herein includes an apparatus with a processor, and a memory storing instructions which, when executed by the processor, causes the processor to):
receive process data of a manufacturing process ([Column 8, Lines 64-66] FIG. 3 is a flowchart illustrating a method to generate a set of predictions associated with manufacture processes of a physical object. Note: Also see Fig. 1),
perform discretization modeling to yield a prediction of a future probability distribution for the state variable ([Column 5, Lines 48-52] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process. [Column 3, Lines 42-45] Such a Predictive System for Manufacture Processes (hereinafter PSMP) server 109 can provide predictions or estimations in near real-time regarding manufacturing processes. [Column 17, Lines 43-45] The implementations described below are discussed in the context of log normal distributions. Note: log normal distributions correspond to continuous probability distribution) a parametric form of ([Column 20, Line 63] parametric probability distribution);
on the predicted future probability distribution ([Column 5, Lines 48-52] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process); 
perform a multi-step forecast of the predicted future probability distribution to create a predicted probability density function ([Column 16, Lines 64-67] In some instances, machine learning models selected to build the predictive engine at 611, are further evaluated using an unseen test dataset 609. Thus, the predictive engine built at 611 generates classification values and/or predicted [Column 17, Lines 1-8] values at 613. Classification and/or prediction values are evaluated at 615 to determine whether such values have achieved a desired accuracy level. When such a desired accuracy level is reached, the training phase ends; when the desired accuracy level is not reached, however, then a subsequent iteration of the process shown in FIG. 6 is performed starting at 601 with variations such as, for example, considering a larger collection of raw data. Note: Subsequent iteration corresponds to multi-step);
use the predicted probability density function as an input to a process control system ([Column 8, Lines 12-17] Predictive engine 215 includes a set of trained machine-learning models and other suitable computation models to infer axioms regarding a physical object represented in a digital model and likelihood or probabilities associated with entities of a supply chain predicted for a manufacturing process request. [Column 18, Lines 10-17] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. In some implementations, the set of parameters is sent to multi-objective optimizer 217. Note: Multi-objective optimizer corresponds to the process control system. Also see Fig. 7);
of the manufacturing process during a control phase using the process control system, wherein the process control system modifies the operating condition based on the input of the predicted probability density function ([Column 18, Lines 10-20] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. In some implementations, the set of parameters is sent to multi-objective optimizer 217 such that a set of axioms and/or attributes associated with the requested manufacturing process can be optimized according to one or more competing objectives and/or conditions. Note: manufacturing process can be optimized corresponds to modifying the manufacturing process).
However, Coffman does not explicitly disclose: from a plurality of equipment, the process data comprising time series sensor data of a state variable of the plurality of equipment; based on the time series sensor data, without making any assumptions on an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable, wherein sensor observations of the state variable comprise a stochastic process; to impose a smoothness condition; and modifying an operating condition of at least one of the plurality of equipment.
Toyama teaches, in an analogous system: from a plurality of equipment ([Column 1, Lines 18-19] data collected from a sensor) the process data comprising time series sensor data of a state variable of the plurality of equipment ([Column 7, Lines 15-18] In addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention. [Column 1, Lines 17-19] in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor);
based on the time series sensor data ([Column 7, Lines 15-18] In addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention. [Column 1, Lines 17-19] in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor) without making any assumptions on an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable ([Column 7, Paragraph 3] For each sensor modality 214, nodes 212, 218 and 220 are variables that are instantiated by the sensor modality 214 and nodes 210 and 216 represent inferred values. In particular, node 210 is a target ground truth node that represents an unknown state of the target object) wherein sensor observations of the state variable comprise a stochastic process ([Column 7, Paragraph 4] From a Bayesian perspective, the ground-truth state influences or causes an output from the sensor modality 214 (it should be noted that the use of term "causes" comprises both deterministic and stochastic components));
and modifying an operating condition of at least one of the plurality of equipment ([Column 17, Lines 58-65] Tracking of visual data is accomplished by use of a Bayesian network that is trained and structured offline by use of dynamic sensor data for determining object position in conjunction with position estimates provided by each modality. Thus, the trained and structured Bayesian modality fusion of the present invention accomplishes visual tracking by adapting its estimates by detecting changes in indicators).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Coffman to incorporate the teachings of Toyama to use process data comprising time series sensor data of a state variable of the plurality of equipment, to perform the modeling based on the time series sensor data, to use an unknown ground truth and sensor observations comprising a stochastic process, and to modify an operating condition of at least one of the plurality of equipment. One would have been motivated to make this modification because doing so would give the benefit of building probabilistic submodels to dynamically diagnose reliability as taught by Toyama [Column 7, Paragraph 5].
Le teaches, in an analogous system: impose a smoothness condition ([Page 1111, section 5, Column 2, Paragraph 3] a smoothing effect on the time series).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to use a smoothness condition. One would have been motivated to make this modification because doing so would provide the advantage of being completely data-driven as taught by Le [Page 1111, section 5, Column 2, Paragraph 3].

Regarding claim 10
Coffman teaches: The system of claim 9, wherein discretization modeling of a continuous probability distribution function further comprises receiving a series of target variables (y), auxiliary observations (x), and control sequences (u) ([27] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process. [148] where p.sub.i is the proportion of observations or samples with a target variable (e.g., SID) and m is the number of different values taken by the target variable. [151] Additionally, certain of the steps may be performed concurrently in a parallel process when possible, as well as performed sequentially as described above. [99] The implementations described below are discussed in the context of log normal distributions. Note: log normal distributions correspond to continuous probability distribution).

Regarding claim 12
The combination of Coffman, Toyama, and Le teaches: The system of claim 9 (as shown above).
However, Coffman does not explicitly disclose: wherein imposing a smoothness condition on the predicted future probability distribution comprises using an artificial neural network with softmax function and a regularized cross-entropy loss.
Le teaches, in an analogous system: wherein imposing a smoothness condition on the predicted probability distribution comprises using an artificial neural network with softmax function and a regularized cross-entropy loss ([Page 1111, section 5] In both cases, emotion decoding has a smoothing effect on the time series. [Page 1108, section 1, column 2] We train a multi-task deep bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) with cost-sensitive Cross Entropy (CE) loss to jointly predict label sequences at different resolutions. [Page 1109, section 2.3] For classification tasks, softmax normalization is applied to the output vector. [Page 1108, section 1] The objective of the challenge was to make temporal predictions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to impose a smoothness condition on the predicted probability distribution using an artificial neural network with softmax function and a regularized cross-entropy loss. One would have been motivated to make this modification because doing so would give the benefit of jointly predicting label sequences at different resolutions as taught by Le [Page 1108, Section 1, Column 2].
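For illustration only, and not drawn from Le, Coffman, or the application: the claimed combination of a softmax function with a regularized cross-entropy loss over a discretized continuous variable can be sketched roughly as follows. The bin layout, the squared-difference smoothness penalty, and the weight `lam` are all assumptions chosen for the sketch; neither the claims nor the cited references specify them.

```python
import math

def softmax(logits):
    """Convert raw network outputs into a probability vector over bins."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bin_index(value, low, high, n_bins):
    """Map a continuous observation to one of n_bins equal-width bins (assumed layout)."""
    width = (high - low) / n_bins
    k = int((value - low) / width)
    return min(max(k, 0), n_bins - 1)  # clamp values on or outside the edges

def regularized_cross_entropy(logits, target_bin, lam=0.1):
    """Cross-entropy plus an assumed smoothness penalty on adjacent bin probabilities."""
    probs = softmax(logits)
    ce = -math.log(probs[target_bin])
    # Hypothetical regularizer: penalize abrupt jumps between neighboring bins.
    smooth = sum((probs[k + 1] - probs[k]) ** 2 for k in range(len(probs) - 1))
    return ce + lam * smooth
```

In this sketch, a perfectly flat predicted distribution incurs no smoothness penalty, so the loss reduces to the plain cross-entropy term.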

Regarding claim 13
Coffman teaches: The system of claim 12, wherein the artificial neural network is initially trained ([114] In some implementations, the artificial neural network can be trained).

Regarding claim 14
Coffman teaches: The system of claim 13, wherein the artificial neural network is trained ([114] In some implementations, the artificial neural network can be trained).
However, Coffman does not explicitly disclose: by minimizing regularized cross-entropy loss.
Le teaches, in an analogous system: by minimizing regularized cross-entropy loss (We train the network to minimize the total cost-sensitive CE (CCE) loss [Page 1110, section 4.2]. Note: CE is the acronym for Cross Entropy [Page 1108, section 1, column 2]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to train the network by minimizing a regularized cross-entropy loss. One would have been motivated to make this modification because doing so would give the benefit of jointly predicting label sequences at different resolutions as taught by Le [Page 1108, Section 1, Column 2].

Regarding claim 15
Coffman teaches: The system of claim 9, further comprising a recurrent neural network for prediction of the future probability distribution ([100] In some implementations, regression machine learning models 703 and 705 can include, for example, one or more deep learning models, one or more machine learning clustering models, one or more instance-based machine learning models, one or more kernel-based machine learning models, and/or any combination thereof. Examples of such deep learning models include deep Boltzmann machines. [102] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. Note: Boltzmann machine corresponds to a recurrent neural network).

Regarding claim 17
Coffman teaches: A computer program product for controlling a process comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer, to cause the computer to perform a method comprising: ([Column 27, Lines 60-65] Some embodiments described herein relate to devices with a non-transitory computer-readable medium (also can be referred to as a non-transitory processor-readable medium or memory) having instructions or computer code thereon for performing various computer-implemented operations):
receiving process data of a manufacturing process ([Column 8, Lines 64-66] FIG. 3 is a flowchart illustrating a method to generate a set of predictions associated with manufacture processes of a physical object. Note: Also see Fig. 1),
discretization modeling, by a processor, of a continuous probability distribution to yield a prediction of a future probability distribution for the state variable ([Column 5, Lines 48-52] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process. [Column 3, Lines 42-45] Such a Predictive System for Manufacture Processes (hereinafter PSMP) server 109 can provide predictions or estimations in near real-time regarding manufacturing processes. [Column 17, Lines 43-45] The implementations described below are discussed in the context of log normal distributions. Note: log normal distributions correspond to continuous probability distribution) a parametric form of ([Column 20, Line 63] parametric probability distribution);
imposing, by the processor, on the predicted future probability distribution ([Column 5, Lines 48-52] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process);
performing, by the processor, a multi-step forecast of the predicted future probability distribution to create a predicted probability density function ([Column 16, Lines 64-67] In some instances, machine learning models selected to build the predictive engine at 611, are further evaluated using an unseen test dataset 609. Thus, the predictive engine built at 611 generates classification values and/or predicted [Column 17, Lines 1- 8] values at 613. Classification and/or prediction values are evaluated at 615 to determine whether such values have achieved a desired accuracy level. When such a desired accuracy level is reached, the training phase ends; when the desired accuracy level is not reached, however, then a subsequent iteration of the process shown in FIG. 6 is performed starting at 601 with variations such as, for example, considering a larger collection of raw data. Note: Subsequent iteration corresponds to multi-step);
using, by the processor, the predicted probability density function as an input to a process control system ([Column 8, Lines 12-17] Predictive engine 215 includes a set of trained machine-learning models and other suitable computation models to infer axioms regarding a physical object represented in a digital model and likelihood or probabilities associated with entities of a supply chain predicted for a manufacturing process request. [Column 18, Lines 10-17] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. In some implementations, the set of parameters is sent to multi-objective optimizer 217. Note: Multi-objective optimizer corresponds to the process control system. Also see Fig. 7);
of the manufacturing process during a control phase using the process control system, wherein the process control system modifies the operating condition based on the input of the predicted probability density function ([Column 18, Lines 10-20] The requestor entity identifier, one or more of the data values included in inputs 106 and 701, and/or a quoted value for MPR 106 are input into regression machine learning model 703 to produce a set of parameters to define a probability distribution function that describes probabilities that a manufacturing process will be authorized by the requestor entity. In some implementations, the set of parameters is sent to multi-objective optimizer 217 such that a set of axioms and/or attributes associated with the requested manufacturing process can be optimized according to one or more competing objectives and/or conditions. Note: manufacturing process can be optimized corresponds to modifying the manufacturing process).
However, Coffman does not explicitly disclose: from a plurality of equipment, the process data comprising time series sensor data of a state variable of the plurality of equipment; based on the time series sensor data, without making any assumptions on an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable, wherein sensor observations of the state variable comprise a stochastic process; to impose a smoothness condition; and modifying an operating condition of at least one of the plurality of equipment.
Toyama teaches, in an analogous system: from a plurality of equipment ([Column 1, Lines 18-19] data collected from a sensor) the process data comprising time series sensor data of a state variable of the plurality of equipment ([Column 7, Lines 15-18] In addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention. [Column 1, Lines 17-19] in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor);
based on the time series sensor data ([Column 7, Lines 15-18] In addition, conceptual links between Bayesian networks and probabilistic time-series analysis tools such as hidden Markov models (HMMs) and Kalman filters can be implemented in the present invention. [Column 1, Lines 17-19] in particular to a model such as a Bayesian network, that can be trained offline from data collected from a sensor) without making any assumptions on an underlying probability distribution of the state variable, wherein a ground truth of the state variable is not observable ([Column 7, Paragraph 3] For each sensor modality 214, nodes 212, 218 and 220 are variables that are instantiated by the sensor modality 214 and nodes 210 and 216 represent inferred values. In particular, node 210 is a target ground truth node that represents an unknown state of the target object) wherein sensor observations of the state variable comprise a stochastic process ([Column 7, Paragraph 4] From a Bayesian perspective, the ground-truth state influences or causes an output from the sensor modality 214 (it should be noted that the use of term "causes" comprises both deterministic and stochastic components));
and modifying an operating condition of at least one of the plurality of equipment ([Column 17, Lines 58-65] Tracking of visual data is accomplished by use of a Bayesian network that is trained and structured offline by use of dynamic sensor data for determining object position in conjunction with position estimates provided by each modality. Thus, the trained and structured Bayesian modality fusion of the present invention accomplishes visual tracking by adapting its estimates by detecting changes in indicators).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Coffman to incorporate the teachings of Toyama to use process data comprising time series sensor data of a state variable of the plurality of equipment, to perform the modeling based on the time series sensor data, to use an unknown ground truth and sensor observations comprising a stochastic process, and to modify an operating condition of at least one of the plurality of equipment. One would have been motivated to make this modification because doing so would give the benefit of building probabilistic submodels to dynamically diagnose reliability as taught by Toyama [Column 7, Paragraph 5].
Le teaches, in an analogous system: a smoothness condition ([Page 1111, section 5, Column 2, Paragraph 3] a smoothing effect on the time series).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to use a smoothness condition. One would have been motivated to make this modification because doing so would provide the advantage of being completely data-driven as taught by Le [Page 1111, section 5, Column 2, Paragraph 3].

Regarding claim 18
Coffman teaches: The computer program product of claim 17, wherein discretization modeling of a continuous probability distribution function further comprises receiving, by the processor, a series of target variables (y), auxiliary observations (x), and control sequences (u) ([27] From these various memory units, processor 207 can retrieve instructions to execute and/or data to perform processes for discretization, manufacturability analysis, optimization and predictions associated with manufacturing process. [148] where p.sub.i is the proportion of observations or samples with a target variable (e.g., SID) and m is the number of different values taken by the target variable. [151] Additionally, certain of the steps may be performed concurrently in a parallel process when possible, as well as performed sequentially as described above. [99] The implementations described below are discussed in the context of log normal distributions. Note: log normal distributions correspond to continuous probability distribution).

Regarding claim 20
The combination of Coffman, Toyama, and Le teaches: The computer program product of claim 17 (as shown above).
However, Coffman does not explicitly disclose: wherein imposing a smoothness condition on the prediction of the future probability distribution comprises using, by the processor, an artificial neural network with softmax function and a regularized cross-entropy loss.
Le teaches, in an analogous system: wherein imposing a smoothness condition on the prediction of the future probability distribution comprises using, by the processor, an artificial neural network with softmax function and a regularized cross-entropy loss ([Page 1111, section 5] In both cases, emotion decoding has a smoothing effect on the time series. [Page 1108, section 1, column 2] We train a multi-task deep bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) with cost-sensitive Cross Entropy (CE) loss to jointly predict label sequences at different resolutions. [Page 1109, section 2.3] For classification tasks, softmax normalization is applied to the output vector. [Page 1108, section 1] The objective of the challenge was to make temporal predictions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman and Toyama to incorporate the teachings of Le to impose a smoothness condition on the predicted probability distribution using an artificial neural network with softmax function and a regularized cross-entropy loss. One would have been motivated to make this modification because doing so would give the benefit of jointly predicting label sequences at different resolutions as taught by Le [Page 1108, Section 1, Column 2].

Claims 3, 11, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Coffman et al (US 10061300 B1) in view of Toyama (Clustering of time series data—a survey, 2005) and Le et al (Discretized Continuous Speech Emotion Recognition with Multi-Task Deep Recurrent Neural Network, 2017) as applied to claims 1, 9, and 17 above, and further in view of Phillips (Joint Probability and Independence for Continuous RVs, 2014).
Regarding claim 3
The combination of Coffman, Toyama, and Le teaches: The computer-implemented method of claim 2, wherein discretization modeling of (As shown above).
However, the combination of Coffman, Toyama, and Le does not explicitly disclose: a continuous probability distribution function is defined by a formula.
Phillips teaches, in an analogous system: a continuous probability distribution function is defined by a formula (The last but one equation on Page 2 corresponds to the formula).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman, Toyama, and Le to incorporate the teachings of Phillips to use the equation. One would have been motivated to make this modification because doing so would give the benefit of finding conditional probabilities by integrating over an interval as taught by Phillips [Page 2, section conditional probability].

Regarding claim 11
The combination of Coffman, Toyama, and Le teaches: The system of claim 10, wherein discretization modeling of (As shown above).
However, the combination of Coffman, Toyama, and Le does not explicitly disclose: a continuous probability distribution function is defined by a formula.
Phillips teaches, in an analogous system: a continuous probability distribution function is defined by a formula (The last but one equation on Page 2 corresponds to the formula).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman, Toyama, and Le to incorporate the teachings of Phillips to use the equation. One would have been motivated to make this modification because doing so would give the benefit of finding conditional probabilities by integrating over an interval as taught by Phillips [Page 2, section conditional probability].

Regarding claim 19
The combination of Coffman, Toyama, and Le teaches: The computer program product of claim 18, wherein discretization modeling of (As shown above).
However, the combination of Coffman, Toyama, and Le does not explicitly disclose: a continuous probability distribution function is defined by a formula.
Phillips teaches, in an analogous system: a continuous probability distribution function is defined by a formula (The last but one equation on Page 2 corresponds to the formula).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman, Toyama, and Le to incorporate the teachings of Phillips to use the equation. One would have been motivated to make this modification because doing so would give the benefit of finding conditional probabilities by integrating over an interval as taught by Phillips [Page 2, section conditional probability].

Claims 8 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Coffman et al (US 10061300 B1) in view of Toyama (Clustering of time series data—a survey, 2005) and Le et al (Discretized Continuous Speech Emotion Recognition with Multi-Task Deep Recurrent Neural Network, 2017) as applied to claims 1, 9, and 17 above, and further in view of Meek et al (US 7660705 B1).
Regarding claim 8
The combination of Coffman, Toyama, and Le teaches: The computer-implemented method of claim 1 (As shown above).
Coffman further teaches: wherein performing a multi-step forecast of the prediction of the future probability distribution to create a predicted probability density function … ([98] In some instances, machine learning models selected to build the predictive engine at 611, are further evaluated using an unseen test dataset 609. Thus, the predictive engine built at 611 generates classification values and/or predicted values at 613. Classification and/or prediction values are evaluated at 615 to determine whether such values have achieved a desired accuracy level. When such a desired accuracy level is reached, the training phase ends; when the desired accuracy level is not reached, however, then a subsequent iteration of the process shown in FIG. 6 is performed starting at 601 with variations such as, for example, considering a larger collection of raw data. Note: Subsequent iteration corresponds to multi-step).
However, the combination of Coffman, Toyama, and Le does not explicitly disclose: uses a Monte Carlo method.
Meek teaches, in an analogous system: uses a Monte Carlo method (For example, if the desired multi-step forecast is at three time steps in the future, two intermediate forecasts will be made prior to performing the desired multi-step forecast. At 970, an appropriate leaf is located and evaluated at 980, similar to the evaluation at 940, to provide a prediction value at a time step that precedes the time associated with the desired forecast. A plurality of evaluations can be employed by a computationally efficient Monte Carlo approach [Column 26, lines 40-48]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman, Toyama, and Le to incorporate the teachings of Meek to use a Monte Carlo method. One would have been motivated to make this modification because doing so would give the benefit of employing a plurality of evaluations by a computationally efficient approach as taught by Meek [Column 26, paragraph 3].
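For illustration only, and not taken from Meek: the multi-step forecast quoted above (a forecast three steps ahead requiring two intermediate forecasts, evaluated by a Monte Carlo approach) can be sketched roughly as follows. The one-step model `one_step_distribution` is a hypothetical stand-in (a simple random walk over discretized bins); the bin count, sample count, and seed are likewise assumptions.

```python
import random

def one_step_distribution(bin_k, n_bins):
    """Hypothetical one-step predictor: stay put or move one bin left/right."""
    probs = [0.0] * n_bins
    for nb, w in ((bin_k - 1, 0.25), (bin_k, 0.5), (bin_k + 1, 0.25)):
        nb = min(max(nb, 0), n_bins - 1)  # reflect mass back at the edges
        probs[nb] += w
    return probs

def monte_carlo_forecast(start_bin, horizon, n_bins, n_samples=2000, seed=0):
    """Estimate the distribution `horizon` steps ahead by sampling trajectories."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    bins = range(n_bins)
    for _ in range(n_samples):
        k = start_bin
        # Each pass is one forecast step; all but the last are intermediate forecasts.
        for _ in range(horizon):
            weights = one_step_distribution(k, n_bins)
            k = rng.choices(bins, weights=weights)[0]
        counts[k] += 1
    # Normalized histogram approximates the predicted distribution at the horizon.
    return [c / n_samples for c in counts]
```

Calling `monte_carlo_forecast(5, 3, 11)` samples trajectories through two intermediate steps before the third, and returns an empirical distribution over the 11 assumed bins.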

Regarding claim 16
The combination of Coffman, Toyama, and Le teaches: The system of claim 9 (As shown above).
Coffman further teaches: wherein performing a multi-step forecast of the predicted future probability distribution to create a predicted probability density function … to perform the multi-step forecast ([98] In some instances, machine learning models selected to build the predictive engine at 611, are further evaluated using an unseen test dataset 609. Thus, the predictive engine built at 611 generates classification values and/or predicted values at 613. Classification and/or prediction values are evaluated at 615 to determine whether such values have achieved a desired accuracy level. When such a desired accuracy level is reached, the training phase ends; when the desired accuracy level is not reached, however, then a subsequent iteration of the process shown in FIG. 6 is performed starting at 601 with variations such as, for example, considering a larger collection of raw data. Note: Subsequent iteration corresponds to multi-step).
However, the combination of Coffman, Toyama, and Le does not explicitly disclose: uses a Monte Carlo method.
Meek teaches, in an analogous system: uses a Monte Carlo method (For example, if the desired multi-step forecast is at three time steps in the future, two intermediate forecasts will be made prior to performing the desired multi-step forecast. At 970, an appropriate leaf is located and evaluated at 980, similar to the evaluation at 940, to provide a prediction value at a time step that precedes the time associated with the desired forecast. A plurality of evaluations can be employed by a computationally efficient Monte Carlo approach [Column 26, lines 40-48]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined teachings of Coffman, Toyama, and Le to incorporate the teachings of Meek to use a Monte Carlo method. One would have been motivated to make this modification because doing so would give the benefit of employing a plurality of evaluations by a computationally efficient approach as taught by Meek [Column 26, paragraph 3].

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Ide (US 20130262013 A1) discloses INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM, wherein an information processing device includes a sensor that measures predetermined data, a model storage unit that stores a model obtained by modeling time series data measured in the past, an information amount computation unit that computes an information amount obtained from measurement based on the difference of an information amount when measurement by the sensor is not performed which is decided based on a prior distribution of state variables of the model and an information amount when measurement by the sensor is performed which is decided based on a posterior distribution of the state variables of the model, and a measurement control unit that controls the sensor based on the information amount obtained from the measurement.
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHAITANYA RAMESH JAYAKUMAR whose telephone number is (571)272-3369. The examiner can normally be reached Mon-Fri 7am-1pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at (571)272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHAITANYA R JAYAKUMAR/Examiner, Art Unit 2128
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128