DETAILED ACTION
Claims 1, 3-17 and 19-22 (filed 10/05/2022) have been considered in this action.  Claims 1, 3-4, 6-7, 9-17 and 19-20 have been amended.  Claims 2 and 18 have been canceled.  Claim 5 is presented in the same format as previously presented.  Claims 21-22 are newly filed. 

Response to Arguments
Applicant’s arguments, see page 16 paragraph 3, filed 10/05/2022, with respect to objections to the specification have been fully considered and are persuasive.  The objections to the specification have been withdrawn. 

Applicant’s arguments, see page 17 paragraph 3, filed 10/05/2022, with respect to rejection of claims 1-16 under 35 U.S.C. 112(b) have been fully considered and are persuasive.  The rejection of claims 1-16 under 35 U.S.C. 112(b) has been withdrawn. 

Applicant’s arguments, see page 18 paragraph 2, filed 10/05/2022, with respect to rejection of claims 1-11 and 13-20 under 35 U.S.C. 101 have been fully considered and are persuasive.  The rejection of claims 1-11 and 13-20 under 35 U.S.C. 101 has been withdrawn. 

Applicant’s arguments with respect to the rejection of claims 1-6, 9 and 12-15 under 35 U.S.C. 102 and the rejection of claims 7-8, 10-11 and 16-20 under 35 U.S.C. 103, with Cay as the primary reference, have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Objections
Claim 14 is objected to because of the following informalities:  Claim 14 refers to “the plurality of machine learning models”, which, in order to have proper antecedent basis, should be written as “the plurality of homogenous inverted machine learning models” to remain consistent with the language of claim 13.  Appropriate correction is required.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 3-16 and 21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claims 1 and 13 each contain a form of the limitations:
(1) wherein the first inverted machine learning model and the second inverted machine learning model share a model architecture and are each trained using a different set of data, wherein the first inverted machine learning model and the second inverted machine learning model are each trained to determine, based on the expected output data, a first set of input data for configuring the manufacturing process and a second set of input data for configuring the manufacturing process, respectively
(2) obtaining, by the processing device, the first set of input data and the second set of input data, wherein obtaining the first set of input data and the second set of input data comprises determining, using the first inverted machine learning model, the first set of input data
These limitations are contradictory and thus unclear.  The contradictory nature of the limitations stems from the fact that limitation (1) states that the second inverted machine learning model determines the second set of input data (via the use of ‘respectively’) while limitation (2) states that the first inverted machine learning model determines the second set of input data.  It is therefore confusing and unclear whether the first inverted machine learning model or the second inverted machine learning model is that which determines the second set of input data because these limitations suggest conflicting sources providing the determination of the second set of input data.  From these statements it is unclear whether the second inverted machine learning model is required as its functionality is dubious.  For the sake of compact prosecution, the examiner shall consider that the second set of input data is determined by either a first inverted machine learning model or a second inverted machine learning model. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 3-9, 12-16 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Cay et al. (US 11055639, hereinafter Cay) in view of Tristan et al. (US 20190095805, hereinafter Tristan).

In regards to Claim 1, Cay teaches “A method comprising: receiving, by a processing device, expected output data for a manufacturing process, wherein the expected output data defines an attribute of an output of the manufacturing process;” ([col 29 line 65] FIG. 13 is a flow chart of an example of a process for optimizing a manufacturing process using a machine learning model according to some aspects. Prior to executing this process, an operator overseeing a manufacturing process for an object may select a target characteristic to optimize via the process. The target characteristic may be a characteristic of the object, such as a dimension (e.g., length, width, height, depth, curvature, radius, or diameter), shape, color, quality, or price of the object...There may be one or more configurable settings of the manufacturing process that can be tuned to optimize the target characteristic. Examples of such configurable settings can include a temperature applied during the manufacturing process, an amount of pressure applied during the manufacturing process, an amount of a chemical deposited on a substrate during the manufacturing process, a ratio of substances used in a mixture during the manufacturing process, etc. To determine a recommended set of values for the configurable settings that optimizes the target characteristic) “accessing, by the processing device, a plurality of...inverted machine learning models that model the manufacturing process, wherein the plurality of ... inverted machine learning models comprise a first inverted machine learning model and a second inverted machine learning model” (Fig. 15 and [col 1 line 14] The present disclosure relates generally to optimizing processes using one or more machine learning models. 
More specifically, but not by way of limitation, this disclosure relates to optimizing a manufacturing process using one or more machine learning models; [col 1 line 56]  The operations can include executing an optimization model to identify a recommended set of values for configurable settings of a manufacturing process associated with an object. The optimization model can be configured to determine the recommended set of values by implementing an iterative process using an objective function; [col 4 line 24] a computing system can execute an optimization model to identify a recommended set of values for configurable settings of a manufacturing process. Executing the optimization model can involve implementing an iterative process for maximizing or minimize an objective function. In a typical optimization model, the objective function is often a predefined linear function. But in some examples of the present disclosure, one or more trained machine learning models can serve as the objective function. Since the machine learning models may be trained on hundreds or thousands of data points, and since the trained machine learning models can more readily capture non-linear relationships between inputs and outputs than a typical objective function, using the trained machine learning models in this way can yield a more accurate set of recommended values for the configurable settings than may otherwise be possible) “wherein the first inverted machine learning model and the second inverted machine learning model ... 
are each trained using a different set of data, wherein the first inverted machine learning model and the second inverted machine learning model are each trained to determine, based on the expected output data, a first set of input data for configuring the manufacturing process and a second set of input data for configuring the manufacturing process, respectively, and wherein the first set of input data and the second set of input data each comprise a respective value for a first input and a respective value for a second input” ([col 30 line 27] Referring now to block 1300, a processing device can receive historical data relating to a manufacturing process for an object. The historical data can include values of configurable settings used in past runs of the manufacturing process. The historical data can also include values of the target characteristic (to be optimized) resulting from those past runs. For example, the manufacturing process can be for manufacturing a curved optical lens using a heated press. A computer system associated with the manufacturing process may automatically store temperature and pressure settings used in manufacturing each optical lens, along with a resultant curvature of the optical lens. The computing system may automatically store this information each time an optical lens is manufactured using the manufacturing process. That information can build up over time, for example over the course of several months or years, to thereby form the historical data...The training data can specify relationships between (i) a first set of values for the configurable settings of the manufacturing process and (ii) a second set of values for the target characteristic to be optimized, where the second set of values resulted from using the first plurality of values to perform the manufacturing process; [col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. 
The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values. For example, the processing device can transmit the electronic communication to a control system associated with the manufacturing process. The control system can be an electronic or mechanical control system for managing configurable settings of the manufacturing process, such as a temperature, pressure, or power level used in the manufacturing process; wherein temperature, pressure or power level are the input values) “obtaining, by the processing device, the first set of input data and the second set of input data, wherein obtaining the first set of input data and the second set of input data comprises determining, using the first inverted machine learning model, the first set of input data” ([col 4 line 24]  a computing system can execute an optimization model to identify a recommended set of values for configurable settings of a manufacturing process. Executing the optimization model can involve implementing an iterative process for maximizing or minimize an objective function. In a typical optimization model, the objective function is often a predefined linear function. But in some examples of the present disclosure, one or more trained machine learning models can serve as the objective function. Since the machine learning models may be trained on hundreds or thousands of data points, and since the trained machine learning models can more readily capture non-linear relationships between inputs and outputs than a typical objective function, using the trained machine learning models in this way can yield a more accurate set of recommended values for the configurable settings than may otherwise be possible; [col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. 
The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values. For example, the processing device can transmit the electronic communication to a control system associated with the manufacturing process. The control system can be an electronic or mechanical control system for managing configurable settings of the manufacturing process, such as a temperature, pressure, or power level used in the manufacturing process; wherein temperature, pressure or power level are the input values) “combining by the processing device, the first set of input data with the second set of input data to produce a set of manufacturing process inputs, wherein the set of manufacturing process inputs comprises a first plurality of candidate values for the first input and a second plurality of candidate values for the second input” ([col 4 line 5] The optimization model and the machine learning models can cooperate with one another to determine a recommended set of values for configurable settings of the manufacturing process. The recommended set of values can be the combination of values for the configurable settings that best meets a user-defined goal (e.g., a particular quality level or price point), as compared to all of the other combinations of values analyzed during the optimization process. In some examples, the recommended set of values can be the optimal set of values as determined by the optimization process; wherein forming a set is considered combining; [col 32 line 60] The iterative process can begin at block 1402, in which a processing device executing the optimization model can select a current set of candidate values for the configurable settings to be used in the current iteration of the iterative process. The current set of candidate values can be selected from within a current region of a search space defined by the optimization model. 
For example, the optimization model can determine an initial search space to consider in identifying the recommended set of values for the configurable settings) “and storing, by the processing device, the set of manufacturing process inputs in a storage device” ([col 33 line 22]  the process can continue to block 1406, where the current set of candidate values is stored in volatile memory).
Cay fails to explicitly teach that the plurality of inverted machine learning models are homogenous and share the same model architecture.  It is noted that, according to applicant’s specification, homogenous machine learning models and those that share the same model architecture are equivalent, and both terms refer to the type of model being the same.  
Tristan teaches “a plurality of homogenous ... machine learning models” (Fig. 1 and  [0006] such an ensembled decision system and training method provides a number of advantages. First, use of the hashing technique reduces the complexity of the resulting decision system (e.g., decision trees), which reduces computing resource requirements both during training and in the field; [0025] in some types of decision models where the size of the input feature sets impacts the model complexity, for example decision tree models where the number of levels correspond to the number of input features, use of the hashing technique reduces the complexity of the model itself. Thus, the resulting models may be much smaller in size, adding to the runtime benefits of feature hashing; [0027]  Ensembles of decision models may be trained using ensembled learning techniques. For example, the decision system 100 of FIG. 1 includes three decision models 122, 124, and 126, which may be configured to make the same decision. Given a decision task, all decision models may be trained during ensembled training. The individual results of the decision models 122, 124, and 126 may then be combined using a results combiner 132 to produce a final decision 140 of the decision system 100; [0007] In distributed computing environments, the ensembled decision system may be architected to split the work among distinct nodes of the distributed system, while ensuring that each individual decision model on a given node runs as fast as possible; Tristan teaches that a single decision tree model through hashing is split into several decision tree models, although other model types are also understood) “wherein the first inverted machine learning model and the second inverted machine learning model share a model architecture and are each trained using a different set of data” ([0027]the decision system 100 of FIG. 
1 includes three decision models 122, 124, and 126, which may be configured to make the same decision. Given a decision task, all decision models may be trained during ensembled training. The individual results of the decision models 122, 124, and 126 may then be combined using a results combiner 132 to produce a final decision 140 of the decision system 100; [0028] In some embodiments, a bootstrap aggregation (abbreviated “bagging”) technique may be used. In a bagging process, a number n of “bootstrap” data sets is created from the initial training data set. Each bootstrap data set may be used to train one decision model. In some embodiments, to obtain a bootstrap set, the training data set is sampled uniformly in a pseudorandom fashion. The sampling may be performed “with replacement,” that is, the sampling permits the same data record to be repeated during training. In some embodiments, the bagging method reduces the variance of linear regression algorithms and the accuracy of decision models such as classifiers. The pseudorandom sampling also speeds up the training process and ensures that each decision model is exposed to different portions of training data and injects a degree of independence to each of the models) “combining by the processing device, the first set of input data with the second set of input data to produce a set of manufacturing process inputs” ([0004] The decision models may each perform a hashing technique on the input data to produce a respective feature vector from the input data, reducing the feature space dimensionality of the models. The decision models make respective decisions based on their feature vector. The respective decisions are then combined using a combining function to produce an ultimate decision of the decision system. In some embodiments, the combining function may implement a simple vote of the collection of decision models).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system that uses a plurality of inverted machine learning models that attain sets of process parameters for configuring a manufacturing process, each of which is trained using different sets of training data as taught by Cay, with the feature of the hashing system that uses a plurality of homogenous machine learning models which are trained in a bagging ensemble and whose outputs are combined to form a set of outputs as taught by Tristan, because it would confer the stated benefit of Tristan, namely improved run time and resource usage, because the size of the models is kept from becoming so large that the computing resources become bogged down (Tristan at least [0003], [0006], [0007]).  Furthermore, while Cay uses multiple inverted machine learning models, there is little discussion about what architecture is specifically used by the plurality of machine learning models; thus the use of the same type of model can be a mere design choice of a person familiar with ensemble learning machine learning systems.  In addition, the use of different training data for each model would likewise gain the stated benefit of Tristan in that each model would be independent, thus improving overall predictive accuracy ([0008]).  It is well known by persons having ordinary skill in the machine learning arts that different machine learning model architectures have different advantages and disadvantages; thus the choice of a specific model architecture cannot form an inventive concept.  It would be understood by that person having ordinary skill in the art that using multiple models trained with different data but sharing an architecture would improve the predictive accuracy of the modeling system while staying within performance criteria for the computing system executing the models, as noted by Tristan ([0006-0008]).  
Combining these elements can be considered as taking the known feature of using homogenous sets of machine learning models that share the same model architecture and model type but are trained with different data sets, as taught by Tristan, and using it to improve the machine learning system of Cay, which uses a plurality of models trained with different data sets to obtain sets of manufacturing configuration parameters, to achieve the predictable result of a plurality of homogenous machine learning models, each trained with a different training set and each outputting sets of configuration parameters for a manufacturing process that are combined, with a selection made from the combination.  
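
For illustration only, the combination discussed above can be pictured with the following minimal Python sketch. It is not taken from Cay or Tristan. It assumes a deliberately simple shared architecture (a one-variable least-squares line per input), and the names InverseModel, fit_line, bootstrap, combine, and the synthetic "thickness", "temperature" and "pressure" data are all hypothetical. Two models of the same type are each trained on a different bootstrap sample and their predicted input settings are pooled into candidate values per input.

```python
import random
from statistics import mean

INPUTS = ("temperature", "pressure")  # hypothetical configurable settings


def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0:  # degenerate sample: fall back to a constant prediction
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


class InverseModel:
    """Hypothetical inverted model: desired output attribute -> input settings."""

    def fit(self, runs):
        outputs = [r["thickness"] for r in runs]
        self.coef = {k: fit_line(outputs, [r[k] for r in runs]) for k in INPUTS}
        return self

    def predict(self, target):
        return {k: a * target + b for k, (a, b) in self.coef.items()}


def bootstrap(runs, rng):
    """Sample with replacement, as in a bagging process."""
    return [rng.choice(runs) for _ in runs]


def combine(predictions):
    """Pool per-model predictions into candidate values for each input."""
    return {k: [p[k] for p in predictions] for k in INPUTS}


# Synthetic, noise-free historical runs (made up for illustration).
history = [
    {"thickness": float(t), "temperature": 100.0 + 5.0 * t, "pressure": 2.0 + 0.1 * t}
    for t in range(1, 21)
]

rng = random.Random(0)
# Same architecture for every model, different bootstrap training set for each.
models = [InverseModel().fit(bootstrap(history, rng)) for _ in range(2)]
candidates = combine([m.predict(10.0) for m in models])
print(candidates)
```

Because the synthetic data is exactly linear, both models recover essentially the same settings; with real, noisy data the two candidate values for each input would generally differ, which is what makes the pooled set a plurality of candidate values.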

In regards to Claim 3, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Cay further teaches “The method of claim 1, further comprising clustering, by the processing device, sets of input data for configuring the manufacturing process into a plurality of groups, wherein each group of the plurality of groups comprises a respective value for the first input and a respective value for the second input” ([ col 2 line 11] The operations can include executing an optimization model to identify a recommended set of values for configurable settings of a manufacturing process associated with an object. The optimization model can be configured to determine the recommended set of values by implementing an iterative process using an objective function. Each iteration of the iterative process can include selecting a current set of candidate values for the configurable settings from within a current region of a search space defined by the optimization model, the current set of candidate values being selected for use in a current iteration of the iterative process; providing the current set of candidate values as input to a trained machine learning model that is separate from the optimization model, the trained machine learning model being configured to predict a value for a target characteristic of the object or the manufacturing process based on the current set of candidate values;[col 26 line 7] FIG. 11 is a flow chart of an example of a process for generating and using a machine learning model according to some aspects. Machine learning is a branch of artificial intelligence that relates to mathematical models that can learn from, categorize, and make predictions about data. Such mathematical models, which can be referred to as machine learning models, can classify input data among two or more classes; cluster input data among two or more groups; [col 27 line 64] In block 1112, the trained machine learning model is used to analyze the new data and provide a result. 
For example, the new data can be provided as input to the trained machine learning model. The trained machine learning model can analyze the new data and provide a result that includes a classification of the new data into a particular class, a clustering of the new data into a particular group, a prediction based on the new data, or any combination of these).

In regards to Claim 4, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Cay further teaches “The method of claim 1, wherein the input data for the manufacturing process comprises a set of configuration values, and wherein the set of configuration values comprises at least one of a time value, a temperature value, a pressure value, a voltage value, or a gas flow value” ([col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values. For example, the processing device can transmit the electronic communication to a control system associated with the manufacturing process. The control system can be an electronic or mechanical control system for managing configurable settings of the manufacturing process, such as a temperature, pressure, or power level used in the manufacturing process; wherein temperature, pressure or power level are the input values).

In regards to Claim 5, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Cay further teaches “The method of claim 1, wherein the expected output data for the manufacturing process comprises one or more values that indicate a layer thickness, a layer uniformity, or a structural width of a product that will be output by the manufacturing process” ([col 36 line 24]  In the above rear-view mirror example, the optimization model was run for different values of glass thickness and hoop cycles, yielding a total of more than 700 combinations. Some examples of recommended oven temperatures associated with different glass thicknesses and different numbers of hoop cycles are shown in FIGS. 17-18, respectively. In particular, FIG. 17 shows a graph with recommended oven temperatures along the Y-axis and glass thicknesses along the X-axis. The graph includes seven lines representing the different recommended temperatures of the seven industrial ovens for different glass thicknesses.).

In regards to Claim 6, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Cay further teaches “The method of claim 1, wherein the first plurality of input values comprises a range of values for the first input and the second plurality of values comprises a range of values for the second input” (Fig. 17 and [col 36 line 34] In the above rear-view mirror example, the optimization model was run for different values of glass thickness and hoop cycles, yielding a total of more than 700 combinations. Some examples of recommended oven temperatures associated with different glass thicknesses and different numbers of hoop cycles are shown in FIGS. 17-18, respectively. In particular, FIG. 17 shows a graph with recommended oven temperatures along the Y-axis and glass thicknesses along the X-axis. The graph includes seven lines representing the different recommended temperatures of the seven industrial ovens for different glass thicknesses. FIG. 18 shows a graph with recommended oven temperatures along the Y-axis and hoop cycles along the X-axis. The graph includes seven lines representing the different recommended temperatures of the seven industrial ovens for different numbers of hoop cycles; wherein fig. 17 shows that there are ranges of values; [col 2 line 11] The operations can include executing an optimization model to identify a recommended set of values for configurable settings of a manufacturing process associated with an object. The optimization model can be configured to determine the recommended set of values by implementing an iterative process using an objective function. 
Each iteration of the iterative process can include selecting a current set of candidate values for the configurable settings from within a current region of a search space defined by the optimization model, the current set of candidate values being selected for use in a current iteration of the iterative process; providing the current set of candidate values as input to a trained machine learning model that is separate from the optimization model, the trained machine learning model being configured to predict a value for a target characteristic of the object or the manufacturing process based on the current set of candidate values).

In regards to Claim 7, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Tristan further teaches “The method of claim 1, wherein each set of data comprises at least one of a different hyperparameter, a different initialization value, or different training data” ([0008] Depending on the embodiment, optimizations may be made during the training process of such a decision system. As one example, the training process may employ a “bootstrap aggregation” or “bagging” technique, in which the ensemble of decision models are trained using a random subsample of the training data set. In some embodiments, some of the decision models may be trained using only certain subsets of features in the training data. Such techniques are useful to inject some degree of variance into the training of the different decision models, which improves the overall accuracy of the decision system).

In regards to Claim 8, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Tristan further teaches “The method of claim 1, wherein the combining comprises using an ensemble technique to combine output of the plurality of machine learning models” ([0005] the decision system may be trained using an ensemble learning technique. In some embodiments, the collection of decision models and combining function are initially selected based on a set of performance requirements of the decision system and resources available to train the models. In some embodiments, the training data may be divided among the different models. In other embodiments, the training data may be shared among the models using data subsampling functions. The models are then trained in parallel using machine learning techniques. Because each model employs the hashing technique, they may be trained in a feature space with lower dimensionality, thereby saving processing power and memory usage on the training machine. To reduce any errors that are produced by the hashing technique, the models are combined to form an ensemble, where the decisions of the models are combined to produce the ultimate decision of the system. In some embodiments, the combining function may implement as a simple vote by the individual decision models. In some embodiments, the combining function may comprise another model that is itself trained using machine learning techniques).  Cay additionally teaches ([col 26 line 17] Examples of machine learning models can include.... and (vi) ensembles or other combinations of machine learning models).
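
For illustration only, a combining function of the "simple vote" kind described in the cited passage of Tristan might look like the following Python sketch. The function name simple_vote and the example decisions are hypothetical and do not appear in either reference.

```python
from collections import Counter


def simple_vote(decisions):
    """Return the decision made by the largest number of individual models."""
    counts = Counter(decisions)
    return counts.most_common(1)[0][0]


# Three hypothetical decision models, two of which agree.
final_decision = simple_vote(["pass", "fail", "pass"])
print(final_decision)
```

A trained combiner model, which the same passage also mentions, would replace simple_vote with a learned function of the individual decisions.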

In regards to Claim 9, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Cay further teaches “The method of claim 1, wherein the plurality of homogenous inverted machine learning models comprises a plurality of Feed Forward Neural Networks (FFNN)” ([col 26 line 28] In some examples, neural networks can include...feed-forward neural networks; [col 28 line 44] In some examples, the neural network 1200 is a feed-forward neural network; [col 31 line 24] the optimization model can be a hybrid model that employs multiple search algorithms to identify the recommended set of values for the configurable settings. For example, the optimization model can employ a Latin Hypercube Sampling (LHS) algorithm, a Genetic Algorithm (GA), a Generating Set Search (GSS), or any combination of these to effectuate the iterative process. In one particular example, the optimization model can begin with a LHS of the search space to determine possible setting values (values for the configurable settings). From these initial setting values, the GA can begin an iterative process in which it performs crossover operations and random-mutation operations to generate new setting values to try. The crossover operations can use the setting values from promising solutions as parents, such that combinations of these parent values are used to create children for the next iteration. This may help ensure that the optimization model exploits promising regions of the search space. The mutation operations can create random perturbations of the setting values to help ensure exploration of the search space, where the newly created perturbations are evaluated in the next iteration of the optimization model. The iterative process of the GA can continue until the evaluation budget has expired or the solution has stalled and is no longer improving. 
Within each iteration of the GA, a local pattern search algorithm such as GSS can also be used to refine the best-known solution by generating setting values in the local neighborhood of the best-known solution; wherein each iteration of the model is a different model producing different sets of inputs).
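
For illustration only, the iterative crossover and mutation process described in the cited passage of Cay can be sketched as below. This is a simplified sketch under stated assumptions, not Cay's implementation: the initial population is drawn uniformly at random as a stand-in for Latin Hypercube Sampling, the local GSS refinement step is omitted, the objective is minimized, and the names `genetic_search`, `objective`, and `bounds` are hypothetical.

```python
import random


def genetic_search(objective, bounds, pop_size=20, generations=30, seed=0):
    """Minimize objective over setting values within bounds (hypothetical sketch)."""
    rng = random.Random(seed)
    # Initial setting values: uniform random sample (stand-in for LHS).
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=objective)
    for _ in range(generations):
        # Promising solutions serve as parents for the next iteration.
        parents = sorted(pop, key=objective)[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: each setting value is taken from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: random perturbation, clipped to the search space.
            child = [
                min(hi, max(lo, v + rng.gauss(0.0, 0.1 * (hi - lo))))
                for v, (lo, hi) in zip(child, bounds)
            ]
            children.append(child)
        pop = children
        best = min(pop + [best], key=objective)  # keep the best-known solution
    return best
```

In Cay, the objective evaluated at each iteration is one or more trained machine learning models; in this sketch any callable that scores a list of setting values can be passed as `objective`.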

In regards to Claim 12, Cay and Tristan teach the method of obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 1 above.
Cay further teaches “The method of claim 1, further comprising: providing for display a plurality of candidate input value sets, wherein each candidate input value set of the plurality of candidate input value sets corresponds to the expected output data for the manufacturing process; receiving a user selection of a candidate input value set of the plurality of candidate value input sets to obtain a selected candidate input value set; and initiating a run of the manufacturing process using the selected candidate input value set” ([col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values...As another example, the processing device can transmit the electronic communication over a network to a remote user device (e.g., a laptop computer, mobile phone, or tablet) associated with an operator of the manufacturing process. The user device can receive the electronic communication and responsively output the recommended set of values on a display device to the operator, who may be located on the manufacturing floor or otherwise close to a control panel associated with the manufacturing process. Based on the output, the operator can adjust the configurable settings to the recommended set of values to improve the manufacturing process. As still another example, the electronic communication can be a display signal for generating a graphical user interface on a display device, such as a touch-screen display or a liquid crystal display. The graphical user interface can include the recommended set of values. An operator of the manufacturing process can view the graphical user interface on the display device and tune the configurable settings to the recommended set of values, to improve the manufacturing process).

In regards to Claim 13, Cay discloses “A system comprising: a memory; and a processing device communicably coupled to the memory” ([col 1 line 52] One example of the present disclosure can include a system having one or more processing devices and one or more memory devices including instructions that are executable by the one or more processing devices for causing the one or more processing devices to perform operations) “the processing device to: receive, expected output data for a manufacturing process, wherein the expected output data defines an attribute of an output of the manufacturing process” ([col 29 line 65] FIG. 13 is a flow chart of an example of a process for optimizing a manufacturing process using a machine learning model according to some aspects. Prior to executing this process, an operator overseeing a manufacturing process for an object may select a target characteristic to optimize via the process. The target characteristic may be a characteristic of the object, such as a dimension (e.g., length, width, height, depth, curvature, radius, or diameter), shape, color, quality, or price of the object; wherein a target characteristic is considered an expected output) “access a plurality of...inverted machine learning models that model the manufacturing process, wherein the plurality of ... inverted machine learning models comprise a first inverted machine learning model and a second inverted machine learning model” (Fig. 15 and [col 1 line 14] The present disclosure relates generally to optimizing processes using one or more machine learning models. More specifically, but not by way of limitation, this disclosure relates to optimizing a manufacturing process using one or more machine learning models; [col 1 line 56]  The operations can include executing an optimization model to identify a recommended set of values for configurable settings of a manufacturing process associated with an object. 
The optimization model can be configured to determine the recommended set of values by implementing an iterative process using an objective function; [col 4 line 24] a computing system can execute an optimization model to identify a recommended set of values for configurable settings of a manufacturing process. Executing the optimization model can involve implementing an iterative process for maximizing or minimize an objective function. In a typical optimization model, the objective function is often a predefined linear function. But in some examples of the present disclosure, one or more trained machine learning models can serve as the objective function. Since the machine learning models may be trained on hundreds or thousands of data points, and since the trained machine learning models can more readily capture non-linear relationships between inputs and outputs than a typical objective function, using the trained machine learning models in this way can yield a more accurate set of recommended values for the configurable settings than may otherwise be possible) “wherein the first inverted machine learning model and the second inverted machine learning model ... are each trained using a different set of data, wherein the first inverted machine learning model and the second inverted machine learning model are each trained to determine, based on the expected output data, a first set of input data for configuring the manufacturing process and a second set of input data for configuring the manufacturing process, respectively, and wherein the first set of input data and the second set of input data each comprise a respective value for a first input and a respective value for a second input” ([col 30 line 27] Referring now to block 1300, a processing device can receive historical data relating to a manufacturing process for an object. The historical data can include values of configurable settings used in past runs of the manufacturing process. 
The historical data can also include values of the target characteristic (to be optimized) resulting from those past runs. For example, the manufacturing process can be for manufacturing a curved optical lens using a heated press. A computer system associated with the manufacturing process may automatically store temperature and pressure settings used in manufacturing each optical lens, along with a resultant curvature of the optical lens. The computing system may automatically store this information each time an optical lens is manufactured using the manufacturing process. That information can build up over time, for example over the course of several months or years, to thereby form the historical data...The training data can specify relationships between (i) a first set of values for the configurable settings of the manufacturing process and (ii) a second set of values for the target characteristic to be optimized, where the second set of values resulted from using the first plurality of values to perform the manufacturing process; [col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values. For example, the processing device can transmit the electronic communication to a control system associated with the manufacturing process. 
The control system can be an electronic or mechanical control system for managing configurable settings of the manufacturing process, such as a temperature, pressure, or power level used in the manufacturing process; wherein temperature, pressure or power level are the input values) “obtain the first set of input data and the second set of input data, wherein to obtain the first set of input data and the second set of input data, the processing device is to determine, using the first inverted machine learning model, the first set of input data” ([col 4 line 24]  a computing system can execute an optimization model to identify a recommended set of values for configurable settings of a manufacturing process. Executing the optimization model can involve implementing an iterative process for maximizing or minimize an objective function. In a typical optimization model, the objective function is often a predefined linear function. But in some examples of the present disclosure, one or more trained machine learning models can serve as the objective function. Since the machine learning models may be trained on hundreds or thousands of data points, and since the trained machine learning models can more readily capture non-linear relationships between inputs and outputs than a typical objective function, using the trained machine learning models in this way can yield a more accurate set of recommended values for the configurable settings than may otherwise be possible; [col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values. For example, the processing device can transmit the electronic communication to a control system associated with the manufacturing process. 
The control system can be an electronic or mechanical control system for managing configurable settings of the manufacturing process, such as a temperature, pressure, or power level used in the manufacturing process; wherein temperature, pressure or power level are the input values) “combine the first set of input data with the second set of input data to produce a set of manufacturing process inputs, wherein the set of manufacturing process inputs comprises a first plurality of candidate values for the first input and a second plurality of candidate values for the second input” ([col 4 line 5] The optimization model and the machine learning models can cooperate with one another to determine a recommended set of values for configurable settings of the manufacturing process. The recommended set of values can be the combination of values for the configurable settings that best meets a user-defined goal (e.g., a particular quality level or price point), as compared to all of the other combinations of values analyzed during the optimization process. In some examples, the recommended set of values can be the optimal set of values as determined by the optimization process; wherein forming a set is considered combining; [col 32 line 60] The iterative process can begin at block 1402, in which a processing device executing the optimization model can select a current set of candidate values for the configurable settings to be used in the current iteration of the iterative process. The current set of candidate values can be selected from within a current region of a search space defined by the optimization model. For example, the optimization model can determine an initial search space to consider in identifying the recommended set of values for the configurable settings) “and store the set of manufacturing process inputs in a storage device” ([col 33 line 22]  the process can continue to block 1406, where the current set of candidate values is stored in volatile memory).
Cay fails to explicitly teach that the plurality of inverted machine learning models are homogenous and share the same model architecture.  It is noted that, according to applicant’s specification, homogenous machine learning models and those that share the same model architecture are equivalent, and both terms refer to the type of model being the same.
Tristan teaches “a plurality of homogenous ... machine learning models” (Fig. 1 and  [0006] such an ensembled decision system and training method provides a number of advantages. First, use of the hashing technique reduces the complexity of the resulting decision system (e.g., decision trees), which reduces computing resource requirements both during training and in the field; [0025] in some types of decision models where the size of the input feature sets impacts the model complexity, for example decision tree models where the number of levels correspond to the number of input features, use of the hashing technique reduces the complexity of the model itself. Thus, the resulting models may be much smaller in size, adding to the runtime benefits of feature hashing; [0027]  Ensembles of decision models may be trained using ensembled learning techniques. For example, the decision system 100 of FIG. 1 includes three decision models 122, 124, and 126, which may be configured to make the same decision. Given a decision task, all decision models may be trained during ensembled training. The individual results of the decision models 122, 124, and 126 may then be combined using a results combiner 132 to produce a final decision 140 of the decision system 100; [0007] In distributed computing environments, the ensembled decision system may be architected to split the work among distinct nodes of the distributed system, while ensuring that each individual decision model on a given node runs as fast as possible; Tristan teaches that a single decision tree model through hashing is split into several decision tree models, although other model types are also understood) “wherein the first inverted machine learning model and the second inverted machine learning model share a model architecture and are each trained using a different set of data” ([0027]the decision system 100 of FIG. 
1 includes three decision models 122, 124, and 126, which may be configured to make the same decision. Given a decision task, all decision models may be trained during ensembled training. The individual results of the decision models 122, 124, and 126 may then be combined using a results combiner 132 to produce a final decision 140 of the decision system 100; [0028] In some embodiments, a bootstrap aggregation (abbreviated “bagging”) technique may be used. In a bagging process, a number n of “bootstrap” data sets is created from the initial training data set. Each bootstrap data set may be used to train one decision model. In some embodiments, to obtain a bootstrap set, the training data set is sampled uniformly in a pseudorandom fashion. The sampling may be performed “with replacement,” that is, the sampling permits the same data record to be repeated during training. In some embodiments, the bagging method reduces the variance of linear regression algorithms and the accuracy of decision models such as classifiers. The pseudorandom sampling also speeds up the training process and ensures that each decision model is exposed to different portions of training data and injects a degree of independence to each of the models) “combining by the processing device, the first set of input data with the second set of input data to produce a set of manufacturing process inputs” ([0004] The decision models may each perform a hashing technique on the input data to produce a respective feature vector from the input data, reducing the feature space dimensionality of the models. The decision models make respective decisions based on their feature vector. The respective decisions are then combined using a combining function to produce an ultimate decision of the decision system. In some embodiments, the combining function may implement a simple vote of the collection of decision models).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system that uses a plurality of inverted machine learning models that attain sets of process parameters for configuring a manufacturing process, each of which is trained using different sets of training data as taught by Cay, with the feature of the hashing system that uses a plurality of homogenous machine learning models which are trained in a bagging ensemble and whose outputs are combined to form a set of outputs as taught by Tristan because it would confer the stated benefit of Tristan, namely improved run time and resource usage because the size of the models is kept from becoming so large that the computing resources become bogged down (Tristan at least [0003], [0006], [0007]).  Furthermore, while Cay uses multiple inverted machine learning models, there is little discussion about what architecture is specifically used by the plurality of machine learning models, thus the use of the same type of model can be a mere design choice of a person familiar with ensemble learning machine learning systems.  In addition, the use of different training data for each model would likewise gain the stated benefit of Tristan in that each model would be independent, thus improving overall predictive accuracy ([0008]).  It is well known by persons having ordinary skill in the machine learning arts that different machine learning model architectures have different advantages and disadvantages, thus the choice of a specific model architecture cannot form an inventive concept.  It would be understood by that person having ordinary skill in the art that using multiple models trained with different data but sharing an architecture would improve the predictive nature of the modeling system while staying within performance criteria for the computing system executing the models, as noted by Tristan ([0006-0008]).
By combining these elements, it can be considered taking the known feature of using homogenous sets of machine learning models that share the same model architecture and model type but are trained with different data sets as taught by Tristan, and using it to improve the machine learning system that uses a plurality of models trained with different data sets to obtain sets of manufacturing configuration parameters, to achieve the predictable result of using a plurality of homogenous machine learning models, each trained with a different training set and each outputting a set of configuration parameters for a manufacturing process, wherein the sets are combined and a selection is made from the combination.
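
For illustration only, the claimed step of combining a first set of input data with a second set of input data to produce a plurality of candidate values for each input can be sketched as follows. This sketch is not taken from Cay, Tristan, or applicant's specification; the function names are hypothetical, and the input names "temperature" and "pressure" merely echo the configurable settings mentioned in Cay.

```python
def combine_input_sets(input_sets):
    """Group the values proposed by each model under their input name."""
    combined = {}
    for input_set in input_sets:
        for name, value in input_set.items():
            combined.setdefault(name, []).append(value)
    return combined


def candidate_ranges(combined):
    """Summarize each input's candidate values as a (min, max) range."""
    return {name: (min(vals), max(vals)) for name, vals in combined.items()}
```

Given two hypothetical model outputs `{"temperature": 200.0, "pressure": 1.5}` and `{"temperature": 210.0, "pressure": 1.2}`, `combine_input_sets` yields two candidate values for each input, and `candidate_ranges` reports the range spanned by those candidates.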

In regards to Claim 14, Cay and Tristan teach the system for obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 13 above.
Cay further teaches “The system of claim 13, wherein the plurality of machine learning models comprises a plurality of Feed Forward Neural Networks (FFNN)” ([col 26 line 28] In some examples, neural networks can include...feed-forward neural networks; [col 28 line 44] In some examples, the neural network 1200 is a feed-forward neural network; [col 31 line 24] the optimization model can be a hybrid model that employs multiple search algorithms to identify the recommended set of values for the configurable settings. For example, the optimization model can employ a Latin Hypercube Sampling (LHS) algorithm, a Genetic Algorithm (GA), a Generating Set Search (GSS), or any combination of these to effectuate the iterative process. In one particular example, the optimization model can begin with a LHS of the search space to determine possible setting values (values for the configurable settings). From these initial setting values, the GA can begin an iterative process in which it performs crossover operations and random-mutation operations to generate new setting values to try. The crossover operations can use the setting values from promising solutions as parents, such that combinations of these parent values are used to create children for the next iteration. This may help ensure that the optimization model exploits promising regions of the search space. The mutation operations can create random perturbations of the setting values to help ensure exploration of the search space, where the newly created perturbations are evaluated in the next iteration of the optimization model. The iterative process of the GA can continue until the evaluation budget has expired or the solution has stalled and is no longer improving. 
Within each iteration of the GA, a local pattern search algorithm such as GSS can also be used to refine the best-known solution by generating setting values in the local neighborhood of the best-known solution; wherein each iteration of the model is a different model producing different sets of inputs).

In regards to Claim 15, Cay and Tristan teach the system for obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 13 above.
Cay further teaches “The method of claim 1, wherein the first plurality of input values comprises a range of values for the first input and the second plurality of values comprises a range of values for the second input” (Fig. 17 and [col 36 line 34] In the above rear-view mirror example, the optimization model was run for different values of glass thickness and hoop cycles, yielding a total of more than 700 combinations. Some examples of recommended oven temperatures associated with different glass thicknesses and different numbers of hoop cycles are shown in FIGS. 17-18, respectively. In particular, FIG. 17 shows a graph with recommended oven temperatures along the Y-axis and glass thicknesses along the X-axis. The graph includes seven lines representing the different recommended temperatures of the seven industrial ovens for different glass thicknesses. FIG. 18 shows a graph with recommended oven temperatures along the Y-axis and hoop cycles along the X-axis. The graph includes seven lines representing the different recommended temperatures of the seven industrial ovens for different numbers of hoop cycles; wherein fig. 17 shows that there are ranges of values; [col 2 line 11] The operations can include executing an optimization model to identify a recommended set of values for configurable settings of a manufacturing process associated with an object. The optimization model can be configured to determine the recommended set of values by implementing an iterative process using an objective function. 
Each iteration of the iterative process can include selecting a current set of candidate values for the configurable settings from within a current region of a search space defined by the optimization model, the current set of candidate values being selected for use in a current iteration of the iterative process; providing the current set of candidate values as input to a trained machine learning model that is separate from the optimization model, the trained machine learning model being configured to predict a value for a target characteristic of the object or the manufacturing process based on the current set of candidate values).

In regards to Claim 16, Cay and Tristan teach the system for obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 13 above.
Tristan further teaches “The method of claim 1, wherein each different set of data comprises at least one of a different hyperparameter, a different initialization value, or different training data” ([0008] Depending on the embodiment, optimizations may be made during the training process of such a decision system. As one example, the training process may employ a “bootstrap aggregation” or “bagging” technique, in which the ensemble of decision models are trained using a random subsample of the training data set. In some embodiments, some of the decision models may be trained using only certain subsets of features in the training data. Such techniques are useful to inject some degree of variance into the training of the different decision models, which improves the overall accuracy of the decision system).


In regards to Claim 21, Cay and Tristan teach the system for obtaining manufacturing parameters using a plurality of homogenous machine learning models as taught by claim 13 above.
Cay further teaches “The system of claim 13, wherein the processing device is further to: provide for display a plurality of candidate input value sets, wherein each candidate input value set of the plurality of candidate input value sets corresponds to the expected output data for the manufacturing process; receive a user selection of a candidate input value set of the plurality of candidate input value sets to obtain a selected candidate input value set; and initiate a run of the manufacturing process using the selected candidate input value set” ([col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values...As another example, the processing device can transmit the electronic communication over a network to a remote user device (e.g., a laptop computer, mobile phone, or tablet) associated with an operator of the manufacturing process. The user device can receive the electronic communication and responsively output the recommended set of values on a display device to the operator, who may be located on the manufacturing floor or otherwise close to a control panel associated with the manufacturing process. Based on the output, the operator can adjust the configurable settings to the recommended set of values to improve the manufacturing process. As still another example, the electronic communication can be a display signal for generating a graphical user interface on a display device, such as a touch-screen display or a liquid crystal display. The graphical user interface can include the recommended set of values. An operator of the manufacturing process can view the graphical user interface on the display device and tune the configurable settings to the recommended set of values, to improve the manufacturing process).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Cay and Tristan as applied to claim 1 above, and further in view of Ma et al. (Constructive Feedforward Neural Networks Using Hermite Polynomial Activation Functions).

In regards to Claim 10, Cay and Tristan teach a system using multiple machine learning models to determine a combined set of input parameters for a manufacturing process as incorporated by claim 1 above.
Cay further teaches “The method of claim 1, wherein the first machine learning model comprises a Feed Forward Neural Network that comprises an output layer and a plurality of hidden layers to model the manufacturing process...linear activation function” ([col 26 line 28] In some examples, neural networks can include...feed-forward neural networks; [col 28 line 44] In some examples, the neural network 1200 is a feed-forward neural network; [col 28 line 60]  the neural network 1200 operates by receiving a vector of numbers from one layer; transforming the vector of numbers into a new vector of numbers using a matrix of numeric weights, a nonlinearity, or both; and providing the new vector of numbers to a subsequent layer of the neural network 1200...The neural network 1200 can transform the weighted vector using a nonlinearity, such as a sigmoid tangent or the hyperbolic tangent. In some examples, the nonlinearity can include a rectified linear unit, which can be expressed using the following equation: y=max(x,0) where y is the output and x is an input value from the weighted vector. The transformed output can be supplied to a subsequent layer, such as the hidden layer 1204; [col 4 line 41] during each iteration of the optimization model, the optimization model can first determine a current set of values for the configurable settings to analyze. In a typical optimization process, the optimization model may next input the current set of values to an objective function that is a predefined linear equation. But in some examples described herein, the optimization model can instead provide the current set of values as input to one or more trained machine learning models that are separate from the optimization model; [col 5 line 10] The optimization model can then input the current set of values to an objective function that needs to be optimized subject to a predefined constraint. 
In a typical optimization process, the predefined constraint may be a linear equation in which one expression has a predefined relationship to another expression).
Cay and Tristan fail to teach “...and wherein the plurality of hidden layers comprises a polynomial function and the output layer comprises a linear activation function”.
Ma teaches “...and wherein the plurality of hidden layers comprises a polynomial function and the output layer comprises a linear activation function” ([page 822 col 1] In this paper, an incremental adaptive constructive structure of a FNN [25], [26], [31] is considered. OHL-FNNs with both linear and nonlinear output layers are utilized here...During the construction process in our proposed scheme, the hidden units are added to the active network one at a time, and the activation function of the hidden units are assigned successively from the lowest order orthonormal Hermite polynomial to the higher order ones; wherein FNN is feedforward neural network).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system using multiple machine learning models to determine a combined set of input parameters for a manufacturing process including feedforward neural networks with hidden and output layers as taught by Cay, with the use of polynomial functions in the hidden layer and a linear activation function in the output layer as taught by Ma because by incorporating these features the various feedforward neural networks of Cay would gain improved accuracy in determining manufacturing configuration values.  While Cay does not go into detail as to the specific mathematical structures of its feedforward neural network layers, the use of different, mathematically related activation functions is a known and obvious modification to a person having ordinary skill in the art of machine learning neural networks, and can be considered a mere design choice.  By combining these elements, it can be considered taking the known use of multiple feedforward neural networks with hidden and output layers that output configuration values for a manufacturing process, and modifying it by utilizing polynomial activation functions in the hidden layers and linear activation functions in the output layer in a known way to achieve predictable results.
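
For illustration only, a feedforward network with polynomial hidden-unit activations and a linear output layer can be sketched as below. This is a simplified sketch, not Ma's construction: it uses unnormalized (physicists') Hermite polynomials rather than the orthonormal Hermite functions of Ma, assigns polynomial order by hidden-unit index, and uses fixed weights rather than an incremental constructive training procedure. All names are hypothetical.

```python
def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via the standard recurrence
    H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x), with H_0 = 1 and H_1 = 2x."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer of polynomial units followed by a linear output unit."""
    hidden = [
        # Hidden unit number `order` applies H_order to its weighted input.
        hermite(order, sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for order, (w, b) in enumerate(zip(hidden_weights, hidden_biases))
    ]
    # Linear activation: the output is a weighted sum plus a bias.
    return sum(v * h for v, h in zip(output_weights, hidden)) + output_bias
```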

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Cay and Tristan as applied to claim 1 above, and further in view of Liano et al. (US 20110320386, hereinafter Liano).

In regards to Claim 11, Cay and Tristan teach a system using multiple machine learning models to determine a combined set of input parameters for a manufacturing process as incorporated by claim 1 above.
Cay and Tristan fail to teach “The method of claim 1, wherein determining, using the first machine learning model, the input data for the manufacturing process comprises executing an inference engine that linearly extrapolates the expected output data of the manufacturing process to identify the input data for the manufacturing process”.
Liano teaches “The method of claim 1, wherein determining, using the first machine learning model, the input data for the manufacturing process comprises executing an inference engine that linearly extrapolates the expected output data of the manufacturing process to identify the input data for the manufacturing process” ([0003] The present disclosure provides novel techniques for defining controllers, predictive systems, and/or optimization systems by utilizing empirical models that are capable of incorporating desired extrapolation properties, such as a candidate basis/kernel function $\phi_b(\cdot)$, as factors used to determine the structure of the empirical model. Once the model has been defined, the model may then be utilized in controller embodiments, model predictive control embodiments, environmental management embodiments, production performance management embodiments, plant operations optimization embodiments, industrial scheduling systems embodiments, and so forth. [0004] An empirical model may be first defined using the following general equation: $f(x) = \sum_{b}^{N_B} \phi_b(w_b, x)$ (1) where $x \in \mathbb{R}^{N_u}$ is the $N_u$-dimensional input vector, $f(\cdot): \mathbb{R}^{N_u} \to \mathbb{R}^{N_y}$ is a linear or nonlinear mapping from the $N_u$-dimensional input space to $N_y$-dimensional output space, $w_b$ is the parameters of the basis/kernel function $\phi_b(\cdot)$ that are determined in the course of the modeling process, and $N_B$ is the number of the basis/kernel functions used for the approximation; [0019] In certain embodiments, such as neural network embodiments, the equation $\phi_b(w_b, x)$ 32 may be used as a basis/kernel equation as described with more detail below with respect to FIG. 4. In other embodiments, such as support vector machine embodiments, the equation $\phi_b(w_b, x)$ 32 may be used as a kernel/basis function as described with more detail below with respect to FIG. 7. 
More generally, the equation $\phi_b(w_b, x)$ 32 may be used to express the empirical model 14 in the form $f(x) = \sum_{b}^{N_B} \phi_b(w_b, x)$, as mentioned above, where $N_B$ is the number of the basis/kernel functions used for the approximation of the modeled system 34. A set of inputs x 36 where $x \in \mathbb{R}^{N_u}$ is the $N_u$-dimensional input vector, may be used as inputs into the modeled system 34. The modeled system 34 may then generate a plurality of outputs y 38 where $y \in \mathbb{R}^{N_y}$ is the $N_y$-dimensional output space).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of using multiple machine learning models to determine a combined set of input parameters for a manufacturing process as taught by Cay with the use of an extrapolation system that modifies a neural network with a linear extrapolation technique so that outputs of the neural network are based on linearly extrapolated values because it would offer the obvious benefit of having a neural network model that better fits what the real-life system is capable of, even when the training data set for the model does not contain the entire range of input values that are possible, as described by Liano ([0003]-[0006]).  As noted by Liano in paragraph [0002], when a neural network system model encounters inputs (expected output data) that are outside what it has been trained with, the model may not create outputs (input data for manufacturing process) in an effective or desirable way; thus, by incorporating these features of Liano, the system of Cay would be expected to gain similar advantages.  Furthermore, Liano explicitly states that the extrapolation techniques are useful in manufacturing environments for manufacturing products using neural networks to model the production process ([0015], [0016]), thus putting it into a similar field of use as Cay.  By combining these elements, it can be considered taking the known use of multiple neural networks that output configuration values for a manufacturing process using expected/target information as inputs, and improving it by implementing the extrapolation techniques of Liano so that the model is capable of accounting for values outside the training set, with expected model inputs linearly extrapolated in a known way to achieve predictable results.
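For illustration only, the general form quoted from Liano, f(x) = sum over b of phi_b(w_b, x), together with one simple way of forcing linear behavior outside the range covered by training data, can be sketched as follows. This is a sketch under stated assumptions and is not Liano's technique: Liano selects basis/kernel functions that have the desired extrapolation property, whereas the sketch merely continues the model along its boundary tangent. The tanh basis, the weights, the training range and all function names are hypothetical.

```python
import math


def empirical_model(x, weights, basis):
    """f(x) = sum_b phi_b(w_b, x) for a scalar input x."""
    return sum(phi(w, x) for w, phi in zip(weights, basis))


def tanh_unit(w, x):
    """Hypothetical basis function: scale * tanh(gain * x + offset)."""
    scale, gain, offset = w
    return scale * math.tanh(gain * x + offset)


def linearly_extrapolated(f, lo, hi, eps=1e-6):
    """Wrap f so that outside [lo, hi] it follows the tangent at the boundary."""

    def slope(x0):
        # Central finite difference; an assumption made to keep the sketch short.
        return (f(x0 + eps) - f(x0 - eps)) / (2.0 * eps)

    def g(x):
        if x < lo:
            return f(lo) + slope(lo) * (x - lo)
        if x > hi:
            return f(hi) + slope(hi) * (x - hi)
        return f(x)

    return g


# Hypothetical two-term model assumed to have been trained on x in [-1, 1].
WEIGHTS = [(1.0, 2.0, 0.0), (0.5, 1.0, 0.3)]
BASIS = [tanh_unit, tanh_unit]


def model(x):
    return empirical_model(x, WEIGHTS, BASIS)


extrapolated_model = linearly_extrapolated(model, -1.0, 1.0)
```

Inside [-1, 1] the wrapped model returns the same values as the original; outside that range equal steps in x produce equal steps in the output, rather than the saturation a tanh basis would otherwise show.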

Claims 17, 19 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Cay in view of Kapoor et al. (WO2020233992, hereinafter Kapoor) and Tristan et al. (US 20190095805, hereinafter Tristan).

In regards to Claim 17, Cay teaches “A non-transitory machine-readable storage medium storing instructions which, when executed, cause a processing device to perform operations” ([col 2 line 21] Yet another example of the present disclosure includes a non-transitory computer-readable medium comprising program code that is executable by one or more processors for causing the one or more processors to perform operations) “comprising: accessing output training data of a manufacturing process, wherein the output training data is associated with input training data used by the manufacturing process” ([col 29 line 65] FIG. 13 is a flow chart of an example of a process for optimizing a manufacturing process using a machine learning model according to some aspects. Prior to executing this process, an operator overseeing a manufacturing process for an object may select a target characteristic to optimize via the process. The target characteristic may be a characteristic of the object, such as a dimension (e.g., length, width, height, depth, curvature, radius, or diameter), shape, color, quality, or price of the object...There may be one or more configurable settings of the manufacturing process that can be tuned to optimize the target characteristic. Examples of such configurable settings can include a temperature applied during the manufacturing process, an amount of pressure applied during the manufacturing process, an amount of a chemical deposited on a substrate during the manufacturing process, a ratio of substances used in a mixture during the manufacturing process, etc. 
To determine a recommended set of values for the configurable settings that optimizes the target characteristic; ) “training, based on the output training data and the input training data, a first inverted machine learning model and a second inverted machine learning model...” ([col 2 line 21] The operations can include executing an optimization model to identify a recommended set of values for configurable settings of a manufacturing process associated with an object. The optimization model can be configured to determine the recommended set of values by implementing an iterative process using an objective function. Each iteration of the iterative process can include selecting a current set of candidate values for the configurable settings from within a current region of a search space defined by the optimization model, the current set of candidate values being selected for use in a current iteration of the iterative process; providing the current set of candidate values as input to a trained machine learning model that is separate from the optimization model, the trained machine learning model being configured to predict a value for a target characteristic of the object or the manufacturing process based on the current set of candidate values; receiving the value as output from the trained machine learning model; and identifying a next region of the search space to use in a next iteration of the iterative process based on the value; [col 34 line 63] In some cases, there can be multiple instances 1516a-n of the same trained machine learning model running in parallel on the one or more servers 1518.  The optimization manager 1512 can transmit a respective set of candidate values to each of the instances 1516a-n so that the respective sets of candidate values can be evaluated in parallel by the instances 1516a-n. For example, the optimization manager 1512 can transmit a respective set of candidate values 1420 to instance 1516a of the trained machine learning model. 
The optimization manager 1512 can also transmit another respective set of candidate values to instance 1516b of the trained machine learning model. The optimization manager 1512 can further transmit another respective set of candidate values to instance 1516n of the trained machine learning model. The optimization manager 1512 may keep track of which sets of candidate values are transmitted to each of the instances 1516a-n. The instances 1516a-n can determine output values based on the respective sets of candidate values and return the values to the optimization manager 1512. One example of a value 1522 being returned from an instance 1516a is shown in FIG. 15. The optimization manager 1512 can receive the returned values and provide the values back to the optimization model 1504 for subsequent use during the parallel iterations; wherein because the trained machine learning models are the same they must be homogeneous in terms of architecture; [col 4 line 47] the optimization model can instead provide the current set of values as input to one or more trained machine learning models that are separate from the optimization model. The optimization model may communicate with the one or more trained machine learning models via an application programming interface (API). The trained machine learning models can receive the current set of values and generate respective output values based on the current set of values. The output values can include, for example, a predicted value for a target characteristic of the object or a predicted value for the manufacturing process that is to be optimized. The target characteristic may be selected by the user. Examples of a target characteristic of an object can include a size, shape, color, or dimension of the object. Examples of a target characteristic of a manufacturing process can include a cost, an amount of time, or an amount of waste associated with the manufacturing process. 
After determining the output values, the trained machine learning models can return the output values to the optimization model (e.g., via the API), which can use the output value for the remainder of the current iteration of the optimization process. In this way, the one or more trained machine learning models can serve as a substitute for a typical objective function, which may yield more accurate results from the optimization process than using a typical objective function) “selecting, by a processing device, selected output data and selected input data for the manufacturing process, wherein the selected output data defines an output attribute of the manufacturing process;” ([col 4 line 24]  a computing system can execute an optimization model to identify a recommended set of values for configurable settings of a manufacturing process. Executing the optimization model can involve implementing an iterative process for maximizing or minimize an objective function. In a typical optimization model, the objective function is often a predefined linear function. But in some examples of the present disclosure, one or more trained machine learning models can serve as the objective function. Since the machine learning models may be trained on hundreds or thousands of data points, and since the trained machine learning models can more readily capture non-linear relationships between inputs and outputs than a typical objective function, using the trained machine learning models in this way can yield a more accurate set of recommended values for the configurable settings than may otherwise be possible; [col 29 line 65] FIG. 13 is a flow chart of an example of a process for optimizing a manufacturing process using a machine learning model according to some aspects. Prior to executing this process, an operator overseeing a manufacturing process for an object may select a target characteristic to optimize via the process. 
The target characteristic may be a characteristic of the object, such as a dimension (e.g., length, width, height, depth, curvature, radius, or diameter), shape, color, quality, or price of the object...There may be one or more configurable settings of the manufacturing process that can be tuned to optimize the target characteristic. Examples of such configurable settings can include a temperature applied during the manufacturing process, an amount of pressure applied during the manufacturing process, an amount of a chemical deposited on a substrate during the manufacturing process, a ratio of substances used in a mixture during the manufacturing process, etc. To determine a recommended set of values for the configurable settings that optimizes the target characteristic [col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values. For example, the processing device can transmit the electronic communication to a control system associated with the manufacturing process. The control system can be an electronic or mechanical control system for managing configurable settings of the manufacturing process, such as a temperature, pressure, or power level used in the manufacturing process; wherein temperature, pressure or power level are the input values) “determining, using the first inverted machine learning model, a set of input data for configuring the manufacturing process based on the selected output data, wherein the set of input data comprises a configuration value for a first input and a configuration value for a second input;” ([col 4 line 24]  a computing system can execute an optimization model to identify a recommended set of values for configurable settings of a manufacturing process. 
Executing the optimization model can involve implementing an iterative process for maximizing or minimize an objective function. In a typical optimization model, the objective function is often a predefined linear function. But in some examples of the present disclosure, one or more trained machine learning models can serve as the objective function. Since the machine learning models may be trained on hundreds or thousands of data points, and since the trained machine learning models can more readily capture non-linear relationships between inputs and outputs than a typical objective function, using the trained machine learning models in this way can yield a more accurate set of recommended values for the configurable settings than may otherwise be possible; [col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values. For example, the processing device can transmit the electronic communication to a control system associated with the manufacturing process. The control system can be an electronic or mechanical control system for managing configurable settings of the manufacturing process, such as a temperature, pressure, or power level used in the manufacturing process; wherein temperature, pressure or power level are the input values) “and comparing the selected input data for the manufacturing process and the set of input data” ([col 4 line 5] The optimization model and the machine learning models can cooperate with one another to determine a recommended set of values for configurable settings of the manufacturing process. 
The recommended set of values can be the combination of values for the configurable settings that best meets a user-defined goal (e.g., a particular quality level or price point), as compared to all of the other combinations of values analyzed during the optimization process. In some examples, the recommended set of values can be the optimal set of values as determined by the optimization process; [col 27 line 32]  The evaluation dataset can include inputs correlated to desired outputs. The inputs can be provided to the machine learning model and the outputs from the machine learning model can be compared to the desired outputs. If the outputs from the machine learning model closely correspond with the desired outputs, the machine learning model may have a high degree of accuracy. For example, if 90% or more of the outputs from the machine learning model are the same as the desired outputs in the evaluation dataset, the machine learning model may have a high degree of accuracy. Otherwise, the machine learning model may have a low degree of accuracy. The 90% number is an example only. A realistic and desirable accuracy percentage is dependent on the problem and the data).
Cay fails to teach the remaining portion of the limitation, namely “wherein the first inverted machine learning model and the second inverted machine learning model are homogenous models that share a model architecture and are each trained using a different hyperparameter”.
Tristan teaches “wherein the first inverted machine learning model and the second inverted machine learning model are homogenous models that share a model architecture” (Fig. 1 and  [0006] such an ensembled decision system and training method provides a number of advantages. First, use of the hashing technique reduces the complexity of the resulting decision system (e.g., decision trees), which reduces computing resource requirements both during training and in the field; [0025] in some types of decision models where the size of the input feature sets impacts the model complexity, for example decision tree models where the number of levels correspond to the number of input features, use of the hashing technique reduces the complexity of the model itself. Thus, the resulting models may be much smaller in size, adding to the runtime benefits of feature hashing; [0027]  Ensembles of decision models may be trained using ensembled learning techniques. For example, the decision system 100 of FIG. 1 includes three decision models 122, 124, and 126, which may be configured to make the same decision. Given a decision task, all decision models may be trained during ensembled training. The individual results of the decision models 122, 124, and 126 may then be combined using a results combiner 132 to produce a final decision 140 of the decision system 100; [0007] In distributed computing environments, the ensembled decision system may be architected to split the work among distinct nodes of the distributed system, while ensuring that each individual decision model on a given node runs as fast as possible; Tristan teaches that a single decision tree model through hashing is split into several decision tree models, although other model types are also understood; [0027]the decision system 100 of FIG. 1 includes three decision models 122, 124, and 126, which may be configured to make the same decision. 
Given a decision task, all decision models may be trained during ensembled training. The individual results of the decision models 122, 124, and 126 may then be combined using a results combiner 132 to produce a final decision 140 of the decision system 100; [0028] In some embodiments, a bootstrap aggregation (abbreviated “bagging”) technique may be used. In a bagging process, a number n of “bootstrap” data sets is created from the initial training data set. Each bootstrap data set may be used to train one decision model. In some embodiments, to obtain a bootstrap set, the training data set is sampled uniformly in a pseudorandom fashion. The sampling may be performed “with replacement,” that is, the sampling permits the same data record to be repeated during training. In some embodiments, the bagging method reduces the variance of linear regression algorithms and the accuracy of decision models such as classifiers. The pseudorandom sampling also speeds up the training process and ensures that each decision model is exposed to different portions of training data and injects a degree of independence to each of the models).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system that uses a plurality of inverted machine learning models that attain sets of process parameters for configuring a manufacturing process, each of which is trained using different sets of training data as taught by Cay, with the feature of the hashing system that uses a plurality of homogenous machine learning models which are trained in a bagging ensemble and whose outputs are combined to form a set of outputs as taught by Tristan because it would confer the stated benefit of Tristan, namely improved run time and resource usage because the size of the models is kept from becoming so large that the computing resources become bogged down (Tristan at least [0003], [0006], [0007]).  Furthermore, while Cay uses multiple inverted machine learning models, there is little discussion about what architecture is specifically used by the plurality of machine learning models; thus the use of the same type of model can be a mere design choice of a person familiar with ensemble learning machine learning systems.  In addition, the use of different training data for each model would likewise gain the stated benefit of Tristan in that each model would be independent, thus improving overall predictive accuracy ([0008]).  It is well known by persons having ordinary skill in the machine learning arts that different machine learning model architectures have different advantages and disadvantages; thus the choice of a specific model architecture cannot form an inventive concept.  It would be understood by that person having ordinary skill in the art that using multiple models trained with different data but sharing an architecture would improve the predictive nature of the modeling system while staying within performance criteria for the computing system executing the models, as noted by Tristan ([0006]-[0008]).  
By combining these elements, it can be considered taking the known feature of using homogenous sets of machine learning models that share the same model architecture and model type but are trained with different data sets as taught by Tristan, and using it to improve the machine learning system that uses a plurality of models trained with different data sets to obtain sets of manufacturing configuration parameters, to achieve the predictable result of using a plurality of homogenous machine learning models, each trained with a different training set, that output sets of configuration parameters for a manufacturing process that are combined and selected from the combination.
The combination of Cay and Tristan fails to teach “wherein the first inverted machine learning model and the second inverted machine learning model ...are each trained using a different hyperparameter”.
Kapoor teaches “wherein the first inverted machine learning model and the second inverted machine learning model ...are each trained using a different hyperparameter” ([page 2] In particular, a method for function-specific robustification of a neural network is provided, comprising the steps: a) Providing the neural network, wherein the neural network is or has been trained on the basis of a training data set comprising training data, b) Generating at least one changed training data set by manipulating the training data set, the training data for this purpose being changed in each case while maintaining semantically meaningful content, c) Changing parameters and / or an architecture of the neural network in Dependence of a comparison between an application of the original Training data set and the at least one changed training data set on the trained neural network, d) Training the modified neural network on the basis of the Training data set and at least part of the at least one changed training data set; [page 7] The training data set 2 and the modified training data set 4 are each applied to the neural network 1, that is, they are each fed to the neural network 1 as input data, the input data being propagated through the neural network 1 as part of a feedforward sequence, so that inferred results can be provided at an output of the neural network 1. [page 10] Changing the parameters of the neural network and / or the architecture or structure of the neural network, in particular, the following methods can be used: ...Changing metaparameters (e.g. hyperparameters of convolution layers and changing activation functions)).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system that utilizes a homogenous set of machine learning models that share the same model architecture as taught by Cay and Tristan, with the use of a set of machine learning models that are trained using different hyperparameters as taught by Kapoor because it would gain the stated benefit of Kapoor, a machine learning model that is more robust.  Furthermore, by utilizing machine learning models that have different hyperparameters associated with them but utilize the same architecture, a benefit would be gained of having machine learning models that avoid local minima/maxima because the differently trained models would all have to agree on a given output, rather than the output of a single model being taken at face value as correct.  By combining these elements, it can be considered taking the known system that utilizes homogeneous machine learning models to output settings for a manufacturing process as taught by Cay and Tristan, and improving it by allowing the homogeneous machine learning models to be trained using different hyperparameters in a known way to achieve predictable results.
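For illustration only, the combination discussed above (homogeneous models that share one architecture, each trained on its own bootstrap sample as in the bagging passage of Tristan, each with a different hyperparameter, with the individual results merged by a combiner) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce any reference: the "inverted" model is assumed to map a selected process output to a process input, the shared architecture is a one-coefficient ridge regression, the differing hyperparameter is the ridge penalty, and the process data and all names are hypothetical.

```python
import random


def fit_ridge_1d(pairs, lam):
    """Fit input ~= a * output with L2 penalty lam (closed-form solution).

    pairs -- list of (output, input) observations of the process
    lam   -- ridge penalty; the hyperparameter that differs between models
    """
    s_xy = sum(out * inp for out, inp in pairs)
    s_xx = sum(out * out for out, _ in pairs)
    return s_xy / (s_xx + lam)


def bootstrap(pairs, rng):
    """Sample len(pairs) records uniformly with replacement."""
    return [rng.choice(pairs) for _ in pairs]


def train_ensemble(pairs, lams, seed=0):
    """Train one model per hyperparameter, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_ridge_1d(bootstrap(pairs, rng), lam) for lam in lams]


def predict_input(ensemble, target_output):
    """Results combiner: average the input suggested by each model."""
    estimates = [a * target_output for a in ensemble]
    return sum(estimates) / len(estimates)


# Hypothetical process in which output = 2 * input, so the inverse is
# input = 0.5 * output.
DATA = [(2.0 * i, float(i)) for i in range(1, 21)]
ENSEMBLE = train_ensemble(DATA, lams=[0.0, 0.1, 1.0])
```

Calling predict_input(ENSEMBLE, 10.0) returns a value close to 5.0; the three coefficients differ slightly because each model saw a different bootstrap sample and a different penalty.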

In regards to Claim 19, Cay, Tristan and Kapoor teach a system using multiple machine learning models to determine a combined set of input parameters for a manufacturing process as incorporated by claim 17 above.  
Cay further teaches “The non-transitory machine-readable storage medium of claim 17, wherein the first inverted machine learning model and the second inverted machine learning model each comprise a Feed Forward Neural Network” ([col 26 line 28] In some examples, neural networks can include...feed-forward neural networks; [col 28 line 44] In some examples, the neural network 1200 is a feed-forward neural network; [col 31 line 24] the optimization model can be a hybrid model that employs multiple search algorithms to identify the recommended set of values for the configurable settings. For example, the optimization model can employ a Latin Hypercube Sampling (LHS) algorithm, a Genetic Algorithm (GA), a Generating Set Search (GSS), or any combination of these to effectuate the iterative process. In one particular example, the optimization model can begin with a LHS of the search space to determine possible setting values (values for the configurable settings). From these initial setting values, the GA can begin an iterative process in which it performs crossover operations and random-mutation operations to generate new setting values to try. The crossover operations can use the setting values from promising solutions as parents, such that combinations of these parent values are used to create children for the next iteration. This may help ensure that the optimization model exploits promising regions of the search space. The mutation operations can create random perturbations of the setting values to help ensure exploration of the search space, where the newly created perturbations are evaluated in the next iteration of the optimization model. The iterative process of the GA can continue until the evaluation budget has expired or the solution has stalled and is no longer improving. 
Within each iteration of the GA, a local pattern search algorithm such as GSS can also be used to refine the best-known solution by generating setting values in the local neighborhood of the best-known solution; wherein each iteration of the model is a different model producing different sets of inputs).
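For illustration only, the iterative process quoted from Cay (sample the search space, evaluate candidate settings with a trained model standing in for the objective function, then apply crossover and random mutation to promising candidates) can be sketched as follows. This is a simplified sketch under stated assumptions and is not Cay's implementation: uniform random sampling replaces Latin Hypercube Sampling, the local GSS refinement is omitted, and the surrogate function is a hypothetical stand-in for a trained machine learning model. All names, bounds and constants are assumptions.

```python
import random


def surrogate(settings):
    """Hypothetical stand-in for a trained model that predicts a quality score."""
    temperature, pressure = settings
    return -((temperature - 200.0) ** 2) / 100.0 - (pressure - 5.0) ** 2


def optimize(objective, bounds, iterations=200, pop=20, seed=0):
    """Elitist genetic search that maximizes objective within bounds."""
    rng = random.Random(seed)
    # Initial sampling of the search space (uniform here, not LHS).
    population = [
        tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(pop)
    ]
    for _ in range(iterations):
        # Keep the better half as parents for the next iteration.
        population.sort(key=objective, reverse=True)
        parents = population[: pop // 2]
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)
            # Crossover picks each setting from one parent; mutation adds
            # Gaussian noise; the result is clipped to the allowed range.
            child = tuple(
                min(hi, max(lo, rng.choice((ga, gb)) + rng.gauss(0.0, 0.02 * (hi - lo))))
                for ga, gb, (lo, hi) in zip(a, b, bounds)
            )
            children.append(child)
        population = parents + children
    return max(population, key=objective)


BOUNDS = [(100.0, 300.0), (1.0, 10.0)]  # hypothetical temperature and pressure ranges
```

Calling optimize(surrogate, BOUNDS) returns a (temperature, pressure) pair; with this surrogate the score peaks at a temperature of 200 and a pressure of 5.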

In regards to Claim 22, Cay, Tristan and Kapoor teach a system using multiple machine learning models to determine a combined set of input parameters for a manufacturing process as incorporated by claim 17 above.  
Cay further teaches “The non-transitory machine-readable storage medium of claim 17, wherein the operations further comprise: providing for display a plurality of candidate input value sets, wherein each candidate input value set of the plurality of candidate input value sets corresponds to the expected output data for the manufacturing process; receiving a user selection of a candidate input value set of the plurality of candidate input value sets to obtain a selected candidate input value set; and initiating a run of the manufacturing process using the selected candidate input value set” ([col 32 line 18] In block 1314, the processing device transmits an electronic communication indicating the recommended set of values. The electronic communication can be configured to cause the configurable settings to be adjusted to the recommended set of values...As another example, the processing device can transmit the electronic communication over a network to a remote user device (e.g., a laptop computer, mobile phone, or tablet) associated with an operator of the manufacturing process. The user device can receive the electronic communication and responsively output the recommended set of values on a display device to the operator, who may be located on the manufacturing floor or otherwise close to a control panel associated with the manufacturing process. Based on the output, the operator can adjust the configurable settings to the recommended set of values to improve the manufacturing process. As still another example, the electronic communication can be a display signal for generating a graphical user interface on a display device, such as a touch-screen display or a liquid crystal display. The graphical user interface can include the recommended set of values. An operator of the manufacturing process can view the graphical user interface on the display device and tune the configurable settings to the recommended set of values, to improve the manufacturing process).

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Cay, Tristan and Kapoor as applied to claim 19 above, and further in view of Ma et al. (Constructive Feedforward Neural Networks Using Hermite Polynomial Activation Functions).

In regards to Claim 20, Cay, Tristan and Kapoor teach a system using multiple machine learning models to determine a combined set of input parameters for a manufacturing process as incorporated by claim 19 above.
Cay further teaches “The non-transitory machine-readable storage medium of claim 19, wherein the Feed Forward Neural Network comprise an output layer and a plurality of hidden layers to model the manufacturing process...” ([col 26 line 28] In some examples, neural networks can include...feed-forward neural networks; [col 28 line 44] In some examples, the neural network 1200 is a feed-forward neural network; [col 28 line 60]  the neural network 1200 operates by receiving a vector of numbers from one layer; transforming the vector of numbers into a new vector of numbers using a matrix of numeric weights, a nonlinearity, or both; and providing the new vector of numbers to a subsequent layer of the neural network 1200...The neural network 1200 can transform the weighted vector using a nonlinearity, such as a sigmoid tangent or the hyperbolic tangent. In some examples, the nonlinearity can include a rectified linear unit, which can be expressed using the following equation: y=max(x,0) where y is the output and x is an input value from the weighted vector. The transformed output can be supplied to a subsequent layer, such as the hidden layer 1204).
Cay, Tristan and Kapoor fail to teach “...and wherein the plurality of hidden layers comprises a polynomial function and the output layer comprises a linear activation function”.
Ma teaches “...and wherein the plurality of hidden layers comprises a polynomial function and the output layer comprises a linear activation function” ([page 822 col 1] In this paper, an incremental adaptive constructive structure of a FNN [25], [26], [31] is considered. OHL-FNNs with both linear and nonlinear output layers are utilized here...During the construction process in our proposed scheme, the hidden units are added to the active network one at a time, and the activation function of the hidden units are assigned successively from the lowest order orthonormal Hermite polynomial to the higher order ones; wherein FNN is feedforward neural network).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system using multiple machine learning models to determine a combined set of input parameters for a manufacturing process, including feedforward neural networks with hidden and output layers, as taught by Cay, Tristan and Kapoor, with the use of polynomial functions in the hidden layers and a linear activation function in the output layer as taught by Ma, because by incorporating these features the various feedforward neural networks of Cay would gain improved accuracy in the determination of machining configuration values.  While Cay does not go into detail as to the specific mathematical structures of its feedforward neural network layers, the use of different, mathematically related activation functions is a known and obvious modification to a person having ordinary skill in the art of machine learning neural networks, and can be considered a mere design choice.  Combining these elements amounts to taking the known use of multiple feedforward neural networks with hidden and output layers that output configuration values for a manufacturing process, and modifying it by utilizing polynomial activation functions in the hidden layers and a linear activation function in the output layer in a known way to achieve predictable results.
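For illustration only, the layer structure discussed above (hidden units with polynomial activation functions of successively higher order, feeding a linear output layer) can be sketched as follows.  This sketch is not taken from Cay, Tristan, Kapoor or Ma.  The function names hermite, relu and forward, the weight values, and the use of unnormalized physicists' Hermite polynomials are all assumptions made for the example; Ma describes orthonormal Hermite polynomials, whose exact normalization is not reproduced here.

```python
def hermite(n, x):
    """Unnormalized physicists' Hermite polynomial H_n(x).

    Uses the recurrence H_{k+1}(x) = 2x*H_k(x) - 2k*H_{k-1}(x),
    with H_0(x) = 1 and H_1(x) = 2x.
    """
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def relu(x):
    """Rectified linear unit, y = max(x, 0), as in the passage quoted from Cay."""
    return max(x, 0.0)


def forward(inputs, hidden_weights, output_weights):
    """One hidden layer with polynomial activations and a linear output layer.

    Hidden unit j applies H_j to its weighted input sum, so orders are
    assigned from lowest to highest (an assumption of this sketch).  In this
    unnormalized form H_0 is constant, so unit 0 behaves like a bias term.
    The output layer applies no nonlinearity (linear activation).
    """
    hidden = []
    for order, row in enumerate(hidden_weights):
        z = sum(w * v for w, v in zip(row, inputs))
        hidden.append(hermite(order, z))
    return [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]


if __name__ == "__main__":
    # Hypothetical weights: two inputs, three hidden units, one output.
    hidden_w = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.0]]
    output_w = [[1.0, 0.5, 2.0]]
    print(forward([1.0, 2.0], hidden_w, output_w))
```

Replacing hermite with relu in the hidden layer would give the rectified linear nonlinearity named in the quoted passage of Cay; the remainder of the forward pass would be unchanged.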

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN M SKRZYCKI whose telephone number is (571)272-0933. The examiner can normally be reached M-F 7:30-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kenneth Lo, can be reached on (571) 272-9774. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/JONATHAN MICHAEL SKRZYCKI/ Examiner, Art Unit 2116
/KENNETH M LO/ Supervisory Patent Examiner, Art Unit 2116