DETAILED ACTION
This Action is a response to the filing received 27 June 2019. Claims 1-23 were originally presented for examination. Also on 27 June 2019, a Preliminary Amendment was received, wherein claims 18-23 were canceled and 24-29 were newly added. On 15 April 2022, a second Preliminary Amendment was received, wherein claims 2-4, 8-9 and 12-15 were amended. Claims 1-17 and 24-29 remain pending for examination.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 27 June 2019, 8 November 2019, 22 October 2020, and 16 November 2021 are being considered by the examiner.

Claim Objections
Claims 1-17 and 24-29 are objected to because of the following informalities: In the independent claims, the ordering of operations and the dependency between the results of one operation and a previous or subsequent operation are unclear. For example, it is unclear whether the transformations to the partitioned sketched data occur prior to or following the training of a particular program synthesis unit. The confusion is compounded by the last clause of claim 1, for example, wherein generating sketched baseline data for each individual program synthesis unit follows the training of the diverse sets of the program synthesis units. FIG. 6 in combination with Spec. at 29:21-30:3 shows the following order of operations: obtain sketch data; partition the sketch data; apply transformations to the sketched data; and train the individual program synthesis models using the transformed sets of data. That is, an original data set is received, comprising sketched data. The sketched data is partitioned into groups, creating partitioned sketched data. The partitioned sketched data is transformed, generating transformed partitioned sketched data. The BPS units have the transformed partitioned sketched data applied to generate vivid sketched baseline data.
	Other orders are possible through reading the cited paragraph of the Specification, such that each BPS unit is trained on an original partition of sketch data, and the BPS units are further applied to the transformed sketch data in a subsequent iteration or evolution of the models. However, m x n BPS units are already conceived of prior to the training, and each of the BPS units has a model that is based on the applied sketched data and the transformation.
Appropriate correction is required to enhance clarity of the operations. The objections to the dependent claims result from their failure to clarify the above-noted deficiencies of the independent claims.
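For illustration only, the first order of operations noted above (obtain, partition, transform, then train) may be sketched as follows. The sketch is not drawn from the Specification or from any cited reference: the function names, the round-robin partitioning, and the mean-based stand-in for "training" are assumptions chosen solely to show that each transformation is applied before its unit is trained, and that n partitions and m transformations yield m by n units.

```python
from itertools import product
from statistics import mean


def partition(data, n):
    """Split data into n partitions (round-robin; a hypothetical scheme)."""
    return [data[i::n] for i in range(n)]


def train_unit(samples):
    """Hypothetical stand-in for training one unit: the 'model' is the sample mean."""
    return {"mean": mean(samples), "count": len(samples)}


def build_units(sketched_data, n, transformations):
    """Partition first, then transform each partition, then train on the result."""
    partitions = partition(sketched_data, n)
    units = {}
    for (p_idx, part), (t_name, t_fn) in product(
        enumerate(partitions), transformations.items()
    ):
        baseline = [t_fn(x) for x in part]  # transformed partitioned data
        units[(p_idx, t_name)] = train_unit(baseline)  # training comes last
    return units


# m = 3 illustrative transformations applied to n = 2 partitions
transformations = {
    "identity": lambda x: x,
    "double": lambda x: 2 * x,
    "negate": lambda x: -x,
}
units = build_units([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n=2, transformations=transformations)
```

Under these assumptions, `units` holds one model per (partition, transformation) pair, so the count of units is the product of m and n. The alternative reading, in which units are trained on untransformed partitions and later applied to transformed data, would move the `train_unit` call ahead of the transformation step.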

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 24-29 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter.
	In particular, claim 24 recites at least one machine-readable medium comprising a plurality of instructions, executed on a computing device, to facilitate the computing device to perform one or more operations. The wording of this portion excludes the computing device from the broadest reasonable interpretation of the claim (i.e., the claim recites instructions that are executable on a computing device, but the claim does not positively recite the computing device as a part of the claim). That leaves the machine-readable medium comprising instructions as the sole structural element.
	The term “machine-readable medium” appears only at 67:4-15 of the Specification. “One or more aspects of at least one embodiment may be implemented by representative code stored on a machine-readable medium which represents and/or defines logic within an integrated circuit … machine-readable medium may include instructions which represent various logic … ‘IP cores,’ are reusable units of logic for an integrated circuit that may be stored on a tangible, machine-readable medium.” The term “medium” appears once more at 67:33-35, “The IP core design can be stored for delivery … using non-volatile memory 2940 (e.g., hard disk, flash memory, or any non-volatile storage medium) …”
	The broadest reasonable interpretation “of machine readable media can encompass non-statutory transitory forms of signal transmission, such as a propagating electrical or electromagnetic signal per se” (MPEP § 2106.03(II), citing In re Nuijten, 500 F.3d 1346, 84 USPQ2d 1495 (Fed. Cir. 2007)). Neither the language of the claim nor the description provided in the Specification limits the scope of the term “machine-readable medium” to non-transitory embodiments. The Specification describes machine-readable medium in a non-limiting exemplary sense, and the term “tangible” does not exclude a propagating signal from the broadest reasonable interpretation, as such signals are tangible during their limited durations. Accordingly, claim 24 is directed to signals per se, and is therefore ineligible. The remaining limitations of dependent claims 25-29 do not limit the broadest reasonable interpretation of machine-readable media to non-transitory embodiments, and they are therefore also ineligible.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 7-9, 12-15 and 24-27 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Morris, II et al., U.S. 2016/0350671 A1 (hereinafter “Morris”).

Regarding claim 1, Morris teaches: An apparatus to perform automated program synthesis (Morris, e.g., ¶2, “systems and methods for predicting operational outcomes of interest …” See also, e.g., ¶79, “a broadly stacked ensemble of different strong classifiers (or statistical models) …”) comprising: a memory to store instructions for automated program synthesis (Morris, e.g., ¶93, “predictive system 160 may include … one or more storage memories 210 …” See also, e.g., ¶98, “memory 210 can store program instructions …”); and a compute cluster coupled to the memory (Morris, e.g., ¶55, “predictive system 160 associated therewith can comprise one or more server(s) …” and ¶¶93-94, “predictive system 160 may include one or more processor(s) … Hardware implementations of the processor(s) 200 may be configured to execute computer-executable or machine-executable instructions … one or more processor(s) can include … any combination … can also include a chipset … for controlling communications between one or more processor(s) 200 and one or more of the other components …”), the compute cluster to support the instructions for performing the automated program synthesis including 
partitioning sketched data into partitions (Morris, e.g., ¶57, “feature discovery process … utilizes source data … as well as other sources of data … to identify data features and overall feature data contexts that can enhance the ability of the runtime operations to predict the likelihood of a operational condition of interest … runtime process utilizes the identified feature data contextualizations to generate runtime data sets …” See also, e.g., ¶60, “feature discovery process is an asynchronous operation that continuously examines the unbounded search space of the formatted source data 176 to determine new features that may best represent contexts preceding and correlated with, or causing, an operational outcome of interest …” Examiner’s note: sketch data represents fundamental input values for particular items of data. See, e.g., Spec. at 29:21-23, “… obtaining sketch data (e.g., primitive lines, shapes, objects, images, letters, rods, etc.) …” Further, the terms partitions and groups are used interchangeably, and the Specification provides no specific means of performing the partitioning or grouping), 
training diverse sets of individual program synthesis units each having different capabilities with the partitioned sketched data (Morris, e.g., ¶79, “a broadly stacked ensemble of different strong classifiers (or statistical models) based upon the determined configurations can be used as sub-models … A plurality of varying machine learning models or classifiers can be trained in parallel based on the contextualization 186 for use as sub-models … aggregated data set 190 may be pre-processed such as, e.g., normalization, scaling, missing data imputation, whitening, dimensionality reduction … for use by the machine learning models or classifiers.” Examiner’s note: applicants have described program synthesis as, for example, programming by example, programming by demonstration, Bayesian program synthesis (Spec. at 33:1-2) and/or as including mathematical functions, activation functions, pooling functions, or any other function for program synthesis (Spec. at 33:27-30). Given that the term “program synthesis” has been used elsewhere in the art¹ to describe a variety of machine learning, classification and estimation algorithms, Examiner interprets the sub-models as consistent with the individual program synthesis units. Further, Morris describes models as including naïve Bayes and support vector machines² (which may include Bayesian-influenced implementations)) and 
for each partition applying respective transformations to the partitioned sketched data, and generating sketched baseline data for each individual program synthesis unit (Morris, e.g., ¶60, “aggregator can be a defined transformation (e.g., a mathematical and/or logical relationship) that transforms/configures the formatted source data 176 into a corresponding feature …” See also, e.g., ¶73, “runtime process aggregates 188 the formatted source data 176 based on the features defined by the contextualization 186 to generate an aggregated data set 190 including feature data context values …” Examiner’s note: feature contextualization identifies an input data variable of interest (i.e., a subset or partition of the original formatted source data); these contextualized features are then “aggregated” or transformed into aggregated data to be used as model input. This aggregated / transformed data is the sketched baseline data (i.e., baseline as in input to the model). Further, the sketched baseline data is generated based on the application of the transformations as disclosed by applicants (see Spec. at 29:23-27) for the purpose of increasing a volume of data. See Morris at ¶39, “data can be contextualized from the performance of one or more mathematical operations on the formatted data, such as, for example, by creating a data value that is the square root … multiple contexts in which the operational data can be provided for analysis greatly increases the quantity of parameters to which the plurality of statistical models can be applied …”).

Regarding claim 2, the rejection of claim 1 is incorporated, and Morris further teaches: wherein the program synthesis units comprise Bayesian program synthesis (BPS) units (Morris, e.g., ¶85, “sub-models may comprise any suitable type of model or predictive analytics, such as … naïve Bayes … support vector machines …”).

Regarding claim 3, the rejection of claim 2 is incorporated, and Morris further teaches: wherein each individual BPS unit has a different model based on the sketched data and the transformation (Morris, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” Examiner’s note: as discussed above, a feature is a subset of the collected data (i.e., a group or partition) and the contextualization is the transformation of the feature into a feature or metric based on the original feature data (i.e., a transformed set of feature data)).

Regarding claim 4, the rejection of claim 3 is incorporated, and Morris further teaches: wherein the sketched data is partitioned into n partitions and m transformations are applied to the BPS units to generate m by n sketched baseline data and associated m by n models of the BPS units (Morris, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” See also, e.g., ¶72, “When a feature is found with a score 184 that satisfies the specification of the contextualization 186 for the outcome of interest (e.g., within a defined number of top scoring features … it can then be used to generate an initial contextualization …” Examiner’s note: m and n are unbounded; also, see Morris, e.g., ¶¶68-69, discussing simple statistical transformations such as mean readings that can be applied over multiple data sets. As cited above, the individual sub-models are based on the aggregated / contextualized data (that is, any number of partitions n, with any number of transformations m, may be input into individual sub-models n x m in Morris)).

Regarding claim 7, Morris teaches: A method for automated program synthesis comprising: 
obtaining, …, sketched data (Morris, e.g., ¶56, “raw data can be collected and formatted for use as source data which can be contextualized and analyzed …” Examiner’s note: sketch data represents fundamental input values for particular items of data. See, e.g., Spec. at 29:21-23, “… obtaining sketch data (e.g., primitive lines, shapes, objects, images, letters, rods, etc.) …”); 
[obtaining the sketched data] with at least one computing cluster (Morris, e.g., ¶55, “predictive system 160 associated therewith can comprise one or more server(s) …” and ¶¶93-94, “predictive system 160 may include one or more processor(s) … Hardware implementations of the processor(s) 200 may be configured to execute computer-executable or machine-executable instructions … one or more processor(s) can include … any combination … can also include a chipset … for controlling communications between one or more processor(s) 200 and one or more of the other components …”)
partitioning, with the at least one computing cluster, the sketched data into partitions (Morris, e.g., ¶57, “feature discovery process … utilizes source data … as well as other sources of data … to identify data features and overall feature data contexts that can enhance the ability of the runtime operations to predict the likelihood of a operational condition of interest … runtime process utilizes the identified feature data contextualizations to generate runtime data sets …” See also, e.g., ¶60, “feature discovery process is an asynchronous operation that continuously examines the unbounded search space of the formatted source data 176 to determine new features that may best represent contexts preceding and correlated with, or causing, an operational outcome of interest …” Examiner’s note: sketch data represents fundamental input values for particular items of data. See, e.g., Spec. at 29:21-23, “… obtaining sketch data (e.g., primitive lines, shapes, objects, images, letters, rods, etc.) …” Further, the terms partitions and groups are used interchangeably, and the Specification provides no specific means of performing the partitioning or grouping); 
training, with the at least one computing cluster, diverse sets of individual program synthesis units with the partitioned sketched data and … with each individual program synthesis unit having a different model based on the applied sketched data and transformations (Morris, e.g., ¶79, “a broadly stacked ensemble of different strong classifiers (or statistical models) based upon the determined configurations can be used as sub-models … A plurality of varying machine learning models or classifiers can be trained in parallel based on the contextualization 186 for use as sub-models … aggregated data set 190 may be pre-processed such as, e.g., normalization, scaling, missing data imputation, whitening, dimensionality reduction … for use by the machine learning models or classifiers.” See also, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” Examiner’s note: applicants have described program synthesis as, for example, programming by example, programming by demonstration, Bayesian program synthesis (Spec. at 33:1-2) and/or as including mathematical functions, activation functions, pooling functions, or any other function for program synthesis (Spec. at 33:27-30). Given that the term “program synthesis” has been used elsewhere in the art³ to describe a variety of machine learning, classification and estimation algorithms, Examiner interprets the sub-models as consistent with the individual program synthesis units. Further, Morris describes models as including naïve Bayes and support vector machines⁴ (which may include Bayesian-influenced implementations). Finally, as discussed above, a feature is a subset of the collected data (i.e., a group or partition) and the contextualization is the transformation of the feature into a feature or metric based on the original feature data (i.e., a transformed set of feature data))
for each partition applying respective transformations to increase a volume of data; and generating, with the at least one computing cluster, sketched baseline data (Morris, e.g., ¶60, “aggregator can be a defined transformation (e.g., a mathematical and/or logical relationship) that transforms/configures the formatted source data 176 into a corresponding feature …” See also, e.g., ¶73, “runtime process aggregates 188 the formatted source data 176 based on the features defined by the contextualization 186 to generate an aggregated data set 190 including feature data context values …” Examiner’s note: feature contextualization identifies an input data variable of interest (i.e., a subset or partition of the original formatted source data); these contextualized features are then “aggregated” or transformed into aggregated data to be used as model input. This aggregated / transformed data is the sketched baseline data (i.e., baseline as in input to the model). Further, the sketched baseline data is generated based on the application of the transformations as disclosed by applicants (see Spec. at 29:23-27) for the purpose of increasing a volume of data. See Morris at ¶39, “data can be contextualized from the performance of one or more mathematical operations on the formatted data, such as, for example, by creating a data value that is the square root … multiple contexts in which the operational data can be provided for analysis greatly increases the quantity of parameters to which the plurality of statistical models can be applied …”).

Regarding claim 8, the rejection of claim 7 is incorporated, and Morris further teaches: wherein the program synthesis units comprise Bayesian program synthesis (BPS) units (Morris, e.g., ¶85, “sub-models may comprise any suitable type of model or predictive analytics, such as … naïve Bayes … support vector machines …”).

Regarding claim 9, the rejection of claim 8 is incorporated, and Morris further teaches: wherein the sketched data is partitioned into n partitions and m transformations are applied to the BPS units to generate m by n sketched baseline data and associated m by n models of the BPS units (Morris, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” See also, e.g., ¶72, “When a feature is found with a score 184 that satisfies the specification of the contextualization 186 for the outcome of interest (e.g., within a defined number of top scoring features … it can then be used to generate an initial contextualization …” Examiner’s note: m and n are unbounded; also, see Morris, e.g., ¶¶68-69, discussing simple statistical transformations such as mean readings that can be applied over multiple data sets. As cited above, the individual sub-models are based on the aggregated / contextualized data (that is any number of partitions n, with any number of transformations m, may be input into individual sub-models n x m in Morris)).

Regarding claim 12, Morris teaches: A system comprising: 
a memory to store instructions and data (Morris, e.g., ¶93, “predictive system 160 may include … one or more storage memories 210 …” See also, e.g., ¶98, “memory 210 can store program instructions …”); and a plurality of cores to execute the instructions to perform the automated program synthesis (Morris, e.g., ¶55, “predictive system 160 associated therewith can comprise one or more server(s) …” and ¶¶93-94, “predictive system 160 may include one or more processor(s) … Hardware implementations of the processor(s) 200 may be configured to execute computer-executable or machine-executable instructions … one or more processor(s) can include … any combination … can also include a chipset … for controlling communications between one or more processor(s) 200 and one or more of the other components …”) including 
partitioning sketched data into partitions (Morris, e.g., ¶57, “feature discovery process … utilizes source data … as well as other sources of data … to identify data features and overall feature data contexts that can enhance the ability of the runtime operations to predict the likelihood of a operational condition of interest … runtime process utilizes the identified feature data contextualizations to generate runtime data sets …” See also, e.g., ¶60, “feature discovery process is an asynchronous operation that continuously examines the unbounded search space of the formatted source data 176 to determine new features that may best represent contexts preceding and correlated with, or causing, an operational outcome of interest …” Examiner’s note: sketch data represents fundamental input values for particular items of data. See, e.g., Spec. at 29:21-23, “… obtaining sketch data (e.g., primitive lines, shapes, objects, images, letters, rods, etc.) …” Further, the terms partitions and groups are used interchangeably, and the Specification provides no specific means of performing the partitioning or grouping), 
training diverse sets of individual program synthesis units each having different capabilities with partitioned sketched data (Morris, e.g., ¶79, “a broadly stacked ensemble of different strong classifiers (or statistical models) based upon the determined configurations can be used as sub-models … A plurality of varying machine learning models or classifiers can be trained in parallel based on the contextualization 186 for use as sub-models … aggregated data set 190 may be pre-processed such as, e.g., normalization, scaling, missing data imputation, whitening, dimensionality reduction … for use by the machine learning models or classifiers.” Examiner’s note: applicants have described program synthesis as, for example, programming by example, programming by demonstration, Bayesian program synthesis (Spec. at 33:1-2) and/or as including mathematical functions, activation functions, pooling functions, or any other function for program synthesis (Spec. at 33:27-30). Given that the term “program synthesis” has been used elsewhere in the art⁵ to describe a variety of machine learning, classification and estimation algorithms, Examiner interprets the sub-models as consistent with the individual program synthesis units. Further, Morris describes models as including naïve Bayes and support vector machines⁶ (which may include Bayesian-influenced implementations)) and 
applying respective transformations to each partition, generating sketched baseline data for each individual program synthesis unit (Morris, e.g., ¶60, “aggregator can be a defined transformation (e.g., a mathematical and/or logical relationship) that transforms/configures the formatted source data 176 into a corresponding feature …” See also, e.g., ¶73, “runtime process aggregates 188 the formatted source data 176 based on the features defined by the contextualization 186 to generate an aggregated data set 190 including feature data context values …” Examiner’s note: feature contextualization identifies an input data variable of interest (i.e., a subset or partition of the original formatted source data); these contextualized features are then “aggregated” or transformed into aggregated data to be used as model input. This aggregated / transformed data is the sketched baseline data (i.e., baseline as in input to the model). Further, the sketched baseline data is generated based on the application of the transformations as disclosed by applicants (see Spec. at 29:23-27) for the purpose of increasing a volume of data. See Morris at ¶39, “data can be contextualized from the performance of one or more mathematical operations on the formatted data, such as, for example, by creating a data value that is the square root … multiple contexts in which the operational data can be provided for analysis greatly increases the quantity of parameters to which the plurality of statistical models can be applied …”), and 
training a master program synthesis unit through jointly approximating and modeling behaviors of a whole set of each individual program synthesis unit (Morris, e.g., ¶80, “outputs of the trained sub-models can be combined as a prediction output dataset that the super-model subsequently utilizes as an input set to produce the final prediction output … super-model can serve as the supervisory model, that is the ‘judge,’ to allow selection of the sub-model or combination of sub-models that perform best at modeling the outcome of interest, to produce a final consolidated predicted output 196 having substantially the highest utility …” See also, e.g., ¶84, “predictive system 160 may be configured to train the sub-models and the super-models that combine the outputs of the sub-models using various weights and/or parameters for the purposes of predicting any variety of operational outcomes of interest …” Examiner’s note: the modeling behavior can include at least the generation of the predictive output by each sub-model. Approximating can mean the result of generating the output based on the combined and weighted input of the sub-models).
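For illustration only, the sub-model / super-model arrangement relied upon above can be sketched as a weighted combination of sub-model outputs. This is a minimal sketch under stated assumptions and is not Morris's implementation or applicants' master program synthesis unit: the three sub-models, the fixed weights, and the function names are hypothetical, and a trained super-model would learn its weights rather than receive them as constants.

```python
def sub_model_threshold(x):
    """Hypothetical sub-model: hard threshold at 0.5."""
    return 1.0 if x > 0.5 else 0.0


def sub_model_linear(x):
    """Hypothetical sub-model: passes the input through unchanged."""
    return x


def sub_model_square(x):
    """Hypothetical sub-model: squares the input."""
    return x * x


def super_model(x, sub_models, weights):
    """Combine the sub-model outputs into one prediction by weighted average."""
    outputs = [model(x) for model in sub_models]
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)


sub_models = [sub_model_threshold, sub_model_linear, sub_model_square]
weights = [2.0, 1.0, 1.0]  # assumed fixed here; learned in a trained super-model
prediction = super_model(0.8, sub_models, weights)
```

The point of the sketch is only that the final output is a function of every sub-model's output taken together, which is the sense in which the Examiner's note above reads "jointly approximating and modeling behaviors."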

Regarding claim 13, the rejection of claim 12 is incorporated, and Morris further teaches: wherein the program synthesis units comprise Bayesian program synthesis (BPS) units (Morris, e.g., ¶85, “sub-models may comprise any suitable type of model or predictive analytics, such as … naïve Bayes … support vector machines …”).

Regarding claim 14, the rejection of claim 13 is incorporated, and Morris further teaches: wherein each individual BPS unit has a different model based on the sketched data and the transformation (Morris, e.g., ¶79, “a broadly stacked ensemble of different strong classifiers (or statistical models) based upon the determined configurations can be used as sub-models … A plurality of varying machine learning models or classifiers can be trained in parallel based on the contextualization 186 for use as sub-models … aggregated data set 190 may be pre-processed such as, e.g., normalization, scaling, missing data imputation, whitening, dimensionality reduction … for use by the machine learning models or classifiers.” See also, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” Examiner’s note: applicants have described program synthesis as, for example, programming by example, programming by demonstration, Bayesian program synthesis (Spec. at 33:1-2) and/or as including mathematical functions, activation functions, pooling functions, or any other function for program synthesis (Spec. at 33:27-30). Given that the term “program synthesis” has been used elsewhere in the art⁷ to describe a variety of machine learning, classification and estimation algorithms, Examiner interprets the sub-models as consistent with the individual program synthesis units. Further, Morris describes models as including naïve Bayes and support vector machines⁸ (which may include Bayesian-influenced implementations). Finally, as discussed above, a feature is a subset of the collected data (i.e., a group or partition) and the contextualization is the transformation of the feature into a feature or metric based on the original feature data (i.e., a transformed set of feature data)).

Regarding claim 15, the rejection of claim 14 is incorporated, and Morris further teaches: wherein the sketched data is partitioned into n partitions and m transformations are applied to the BPS units to generate m by n sketched baseline data and associated m by n models of the BPS units (Morris, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” See also, e.g., ¶72, “When a feature is found with a score 184 that satisfies the specification of the contextualization 186 for the outcome of interest (e.g., within a defined number of top scoring features … it can then be used to generate an initial contextualization …” Examiner’s note: m and n are unbounded; also, see Morris, e.g., ¶¶68-69, discussing simple statistical transformations such as mean readings that can be applied over multiple data sets. As cited above, the individual sub-models are based on the aggregated / contextualized data (that is any number of partitions n, with any number of transformations m, may be input into individual sub-models n x m in Morris)).

Regarding claim 24, Morris teaches: At least one machine-readable medium comprising a plurality of instructions, executed on a computing device (Morris, e.g., ¶93, “predictive system 160 may include … one or more storage memories 210 …” See also, e.g., ¶98, “memory 210 can store program instructions …”), to facilitate the computing device to perform one or more operations comprising: 
partitioning sketched data into partitions (Morris, e.g., ¶57, “feature discovery process … utilizes source data … as well as other sources of data … to identify data features and overall feature data contexts that can enhance the ability of the runtime operations to predict the likelihood of a operational condition of interest … runtime process utilizes the identified feature data contextualizations to generate runtime data sets …” See also, e.g., ¶60, “feature discovery process is an asynchronous operation that continuously examines the unbounded search space of the formatted source data 176 to determine new features that may best represent contexts preceding and correlated with, or causing, an operational outcome of interest …” Examiner’s note: sketch data represents fundamental input values for particular items of data. See, e.g., Spec. at 29:21-23, “… obtaining sketch data (e.g., primitive lines, shapes, objects, images, letters, rods, etc.) …” Further, the terms partitions and groups are used interchangeably, and the Specification provides no specific means of performing the partitioning or grouping); 
training diverse sets of individual program synthesis units each having different capabilities with partitioned sketched data (Morris, e.g., ¶79, “a broadly stacked ensemble of different strong classifiers (or statistical models) based upon the determined configurations can be used as sub-models … A plurality of varying machine learning models or classifiers can be trained in parallel based on the contextualization 186 for use as sub-models … aggregated data set 190 may be pre-processed such as, e.g., normalization, scaling, missing data imputation, whitening, dimensionality reduction … for use by the machine learning models or classifiers.” Examiner’s note: applicants have described program synthesis as, for example, programming by example, programming by demonstration, Bayesian program synthesis (Spec. at 33:1-2) and/or as including mathematical functions, activation functions, pooling functions, or any other function for program synthesis (Spec. at 33:27-30). Given that the term “program synthesis” has been used elsewhere in the art[9] to describe a variety of machine learning, classification and estimation algorithms, Examiner interprets the sub-models as consistent with the individual program synthesis units. Further, Morris describes models as including naïve Bayes and support vector machines[10] (which may include Bayesian-influenced implementations). Finally, Examiner notes that each unit (sub-model) of Morris has “different capabilities” in that each produces different classifications based on the configured and aggregated input data selected) and 
applying respective transformations to each partition; generating sketched baseline data for each individual program synthesis unit (Morris, e.g., ¶60, “aggregator can be a defined transformation (e.g., a mathematical and/or logical relationship) that transforms/configures the formatted source data 176 into a corresponding feature …” See also, e.g., ¶73, “runtime process aggregates 188 the formatted source data 176 based on the features defined by the contextualization 186 to generate an aggregated data set 190 including feature data context values …” Examiner’s note: feature contextualization identifies an input data variable of interest (i.e., a subset or partition of the original formatted source data); these contextualized features are then “aggregated” or transformed into aggregated data to be used as model input. This aggregated / transformed data is the sketched baseline data (i.e., baseline as in input to the model). Further, the sketched baseline data is generated based on the application of the transformations as disclosed by applicants (see Spec. at 29:23-27) for the purpose of increasing a volume of data. See Morris at ¶39, “data can be contextualized from the performance of one or more mathematical operations on the formatted data, such as, for example, by creating a data value that is the square root … multiple contexts in which the operational data can be provided for analysis greatly increases the quantity of parameters to which the plurality of statistical models can be applied …”); and 
training a master program synthesis unit through jointly approximating and modeling behaviors of a whole set of each individual program synthesis unit (Morris, e.g., ¶80, “outputs of the trained sub-models can be combined as a prediction output dataset that the super-model subsequently utilizes as an input set to produce the final prediction output … super-model can serve as the supervisory model, that is the ‘judge,’ to allow selection of the sub-model or combination of sub-models that perform best at modeling the outcome of interest, to produce a final consolidated predicted output 196 having substantially the highest utility …” See also, e.g., ¶84, “predictive system 160 may be configured to train the sub-models and the super-models that combine the outputs of the sub-models using various weights and/or parameters for the purposes of predicting any variety of operational outcomes of interest …” Examiner’s note: the modeling behavior can include at least the generation of the predictive output by each sub-model. Approximating can mean the result of generating the output based on the combined and weighted input of the sub-models).
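
For illustration only, the sub-model and super-model relationship discussed in the mapping above can be sketched as follows. The sketch is not taken from Morris or from the Specification. The names accuracy and train_super_model, the threshold rules used as sub-models, and the accuracy-weighted vote are all hypothetical choices made for this example; they stand in for whatever weighting a real super-model would learn.

```python
def accuracy(model, xs, ys):
    """Fraction of examples on which a sub-model agrees with the label."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(ys)


def train_super_model(sub_models, xs, ys):
    """Hypothetical super-model: weight each sub-model by its accuracy,
    then combine the sub-model outputs with a weighted vote."""
    weights = [accuracy(m, xs, ys) for m in sub_models]

    def super_model(x):
        # Each sub-model votes +1 (class 1) or -1 (class 0), scaled by its weight.
        score = sum(w * (1 if m(x) == 1 else -1)
                    for m, w in zip(sub_models, weights))
        return 1 if score > 0 else 0

    return super_model, weights


# Three assumed sub-models with different decision rules.
sub_models = [
    lambda x: 1 if x[0] > 0.5 else 0,
    lambda x: 1 if x[1] > 0.5 else 0,
    lambda x: 1 if x[0] + x[1] > 1.0 else 0,
]

# Toy training data (made up for the example).
xs = [(0.9, 0.2), (0.8, 0.9), (0.2, 0.7), (0.1, 0.2)]
ys = [1, 1, 0, 0]

super_model, weights = train_super_model(sub_models, xs, ys)
```

In this toy setting the second sub-model is right only half the time, so it receives half the voting weight of the other two, and the combined output follows the better-performing sub-models.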

Regarding claim 25, the rejection of claim 24 is incorporated, and Morris further teaches: wherein the program synthesis units comprise Bayesian program synthesis (BPS) units (Morris, e.g., ¶85, “sub-models may comprise any suitable type of model or predictive analytics, such as … naïve Bayes … support vector machines …”).

Regarding claim 26, the rejection of claim 25 is incorporated, and Morris further teaches: wherein each individual BPS unit has a different model based on the sketched data and the transformation (Morris, e.g., ¶79, “a broadly stacked ensemble of different strong classifiers (or statistical models) based upon the determined configurations can be used as sub-models … A plurality of varying machine learning models or classifiers can be trained in parallel based on the contextualization 186 for use as sub-models … aggregated data set 190 may be pre-processed such as, e.g., normalization, scaling, missing data imputation, whitening, dimensionality reduction … for use by the machine learning models or classifiers.” See also, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” Examiner’s note: applicants have described program synthesis as, for example, programming by example, programming by demonstration, Bayesian program synthesis (Spec. at 33:1-2) and/or as including mathematical functions, activation functions, pooling functions, or any other function for program synthesis (Spec. at 33:27-30). Given that the term “program synthesis” has been used elsewhere in the art[11] to describe a variety of machine learning, classification and estimation algorithms, Examiner interprets the sub-models as consistent with the individual program synthesis units. Further, Morris describes models as including naïve Bayes and support vector machines[12] (which may include Bayesian-influenced implementations). Finally, as discussed above, a feature is a subset of the collected data (i.e. a group or partition) and the contextualization is the transformation of the feature into a feature or metric based on the original feature data (i.e., a transformed set of feature data)).

Regarding claim 27, the rejection of claim 26 is incorporated, and Morris further teaches: wherein the sketched data is partitioned into n partitions and m transformations are applied to the BPS units to generate m by n sketched baseline data and associated m by n models of the BPS units (Morris, e.g., ¶73, “contextualization 186 can be used by the runtime process to generate new predictive models 194 based on the features identified in the contextualization …” See also, e.g., ¶72, “When a feature is found with a score 184 that satisfies the specification of the contextualization 186 for the outcome of interest (e.g., within a defined number of top scoring features … it can then be used to generate an initial contextualization …” Examiner’s note: m and n are unbounded; also, see Morris, e.g., ¶¶68-69, discussing simple statistical transformations such as mean readings that can be applied over multiple data sets. As cited above, the individual sub-models are based on the aggregated / contextualized data (that is any number of partitions n, with any number of transformations m, may be input into individual sub-models n x m in Morris)).
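
For illustration only, the m by n arrangement recited in claims 15 and 27 (n partitions, m transformations, m by n data sets and m by n models) can be sketched as follows. The sketch is not drawn from Morris or from the Specification. The round-robin partitioning, the three transformations, and the use of a simple mean as a stand-in "model" are hypothetical assumptions chosen only to make the counting concrete.

```python
from itertools import product
from statistics import mean


def partition(data, n):
    """Split data into n partitions, round-robin (an assumed scheme)."""
    return [data[i::n] for i in range(n)]


# m = 3 hypothetical transformations applied to each partition.
transformations = {
    "identity": lambda xs: list(xs),
    "scaled": lambda xs: [2 * x for x in xs],
    "centered": lambda xs: [x - mean(xs) for x in xs],
}


def fit_mean_model(xs):
    """Stand-in for training a unit: the 'model' is just the data mean."""
    return mean(xs)


data = [1, 2, 3, 4, 5, 6, 7, 8]
n = 2
partitions = partition(data, n)

# One transformed data set per (transformation, partition) pair: m x n in total.
baseline = {
    (name, j): fn(part)
    for (name, fn), (j, part) in product(transformations.items(),
                                         enumerate(partitions))
}

# One model per transformed data set, again m x n in total.
models = {key: fit_mean_model(xs) for key, xs in baseline.items()}
```

With m = 3 and n = 2 the sketch yields six transformed data sets and six corresponding models, one for each pairing of a transformation with a partition.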

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 5 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Morris in view of Vatamanu et al., U.S. 2016/0335432 A1 (hereinafter “Vatamanu”).

Regarding claim 5, the rejection of claim 4 is incorporated, but Morris does not more particularly teach that performing the automated program synthesis includes grouping the BPS units in a cascade based framework, and processing input received by the cascade based framework to generate predictions based on the training and model of each of the BPS units. However, Vatamanu does teach: wherein compute cluster to support the instructions for performing the automated program synthesis including grouping the BPS units in a cascade based framework, processing input received by the cascade base framework to generate predictions based on the training and model of each of the individual BPS units (Vatamanu, e.g., ¶35, “FIG. 4 illustrates a trainer 42 executing on training system 20 and configured to train a cascade of classifiers … The cascade comprises a plurality of classifiers … configured to be used in a specific order. In some embodiments, each classifier of the cascade distinguishes between several distinct groups of objects … Such classifiers may include adaptations of various automated classifiers well-known in the art, e.g., naïve Bayes classifiers …” Examiner’s note: the distinct groups of objects are analogous to the distinct features partitioned and transformed in Morris) for the purpose of utilizing a cascade of one or more classifiers per level in order to optimize classification of a data set including multiple categories or groups of data, and performing the training in an accelerated fashion (Vatamanu, e.g., ¶¶77-82).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that performing the automated program synthesis includes grouping the BPS units in a cascade based framework, and processing input received by the cascade based framework to generate predictions based on the training and model of each of the BPS units because the disclosure of Vatamanu shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that performing the automated program synthesis includes grouping classification units (such as naïve Bayes classifiers) in a cascade based framework, and processing input received by the cascade based framework to generate predictions based on the training and model of each of the classification units for the purpose of utilizing a cascade of one or more classifiers per level in order to optimize classification of a data set including multiple categories or groups of data, and performing the training in an accelerated fashion (Vatamanu, Id.).

Regarding claim 10, the rejection of claim 9 is incorporated, but Morris does not more particularly teach that performing the automated program synthesis includes grouping the BPS units in a cascade based framework, and processing input received by the cascade based framework to generate predictions based on the training and model of each of the BPS units. However, Vatamanu does teach: grouping the individual BPS units into a cascade based framework; and applying input into the cascade based framework of individual BPS units to generate predictions based on the training and model of each of the individual BPS units (Vatamanu, e.g., ¶35, “FIG. 4 illustrates a trainer 42 executing on training system 20 and configured to train a cascade of classifiers … The cascade comprises a plurality of classifiers … configured to be used in a specific order. In some embodiments, each classifier of the cascade distinguishes between several distinct groups of objects … Such classifiers may include adaptations of various automated classifiers well-known in the art, e.g., naïve Bayes classifiers …” Examiner’s note: the distinct groups of objects are analogous to the distinct features partitioned and transformed in Morris) for the purpose of utilizing a cascade of one or more classifiers per level in order to optimize classification of a data set including multiple categories or groups of data, and performing the training in an accelerated fashion (Vatamanu, e.g., ¶¶77-82).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that performing the automated program synthesis includes grouping the BPS units in a cascade based framework, and processing input received by the cascade based framework to generate predictions based on the training and model of each of the BPS units because the disclosure of Vatamanu shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that performing the automated program synthesis includes grouping classification units (such as naïve Bayes classifiers) in a cascade based framework, and processing input received by the cascade based framework to generate predictions based on the training and model of each of the classification units for the purpose of utilizing a cascade of one or more classifiers per level in order to optimize classification of a data set including multiple categories or groups of data, and performing the training in an accelerated fashion (Vatamanu, Id.).
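
For illustration only, a cascade of classifiers "configured to be used in a specific order" can be sketched as follows. This is a generic cascade, not the training procedure of Vatamanu. The names make_stage and cascade_predict, the numeric thresholds, and the rule that a stage may defer by returning None are hypothetical assumptions made for the example.

```python
def make_stage(low, high):
    """Hypothetical stage: decide 0 at or below `low`, 1 at or above `high`,
    and defer (return None) for inputs in between."""
    def stage(x):
        if x <= low:
            return 0
        if x >= high:
            return 1
        return None
    return stage


def cascade_predict(stages, x, default=0):
    """Apply the stages in order; the first stage that decides wins.
    Returns (label, index of the deciding stage)."""
    for index, stage in enumerate(stages):
        label = stage(x)
        if label is not None:
            return label, index
    return default, len(stages)


# Earlier stages handle the easy inputs; later stages handle the rest.
stages = [make_stage(2, 8), make_stage(4, 6), make_stage(5, 5)]

easy = cascade_predict(stages, 9)     # decided by the first stage
medium = cascade_predict(stages, 3)   # deferred once, decided by the second
hard = cascade_predict(stages, 5)     # deferred twice, decided by the third
```

The ordering matters: each input is passed down the cascade only as far as needed, which is the property that lets a cascade keep later, more specialized classifiers for the harder cases.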

Claims 6 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Morris in view of Ranjan, Supranamaya, U.S. 8,682,812 B1 (hereinafter “Ranjan”).

Regarding claim 6, the rejection of claim 4 is incorporated, but Morris does not more particularly teach that performing the automated program synthesis includes grouping the BPS units in a tree based framework, and processing input by the tree based framework to generate predictions based on the training and model of each of the BPS units. However, Ranjan does teach: wherein the compute cluster to support the instructions for performing the automated program synthesis including grouping the BPS units in a tree based framework, processing input received by the tree base framework to generate predictions based on the training and model of each of the individual BPS units (Ranjan, e.g., 20:61-21:3, “the machine learning algorithm uses a Naïve Bayes Tree, which is a hybrid of decision-tree classifiers and Naïve-Bayes classifiers. In particular, the decision-tree nodes contain univariate splits as regular decision-trees, but the leaves contain Naïve-Bayesian classifiers. By applying the machine learning algorithm using the Naïve Bayes Tree, the function F(X) is based on the decision tree …”) for the purpose of assembling a combinatory classification model that may be dynamically updated in order to keep pace with changing behavior of prediction targets (Ranjan, e.g., Abs. and 4:56-60).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that performing the automated program synthesis includes grouping the BPS units in a tree based framework, and processing input by the tree based framework to generate predictions based on the training and model of each of the BPS units because the disclosure of Ranjan shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that performing the automated program synthesis includes grouping the classification units (including naïve-Bayesian classifiers) in a tree based framework, and processing input by the tree based framework to generate predictions based on the training and model of each of the classification units for the purpose of assembling a combinatory classification model that may be dynamically updated in order to keep pace with changing behavior of prediction targets (Ranjan, Id.).

Regarding claim 11, the rejection of claim 9 is incorporated, but Morris does not more particularly teach that performing the automated program synthesis includes grouping the BPS units in a tree based framework, and processing input by the tree based framework to generate predictions based on the training and model of each of the BPS units. However, Ranjan does teach: grouping the individual BPS units into a tree based framework; and applying input into the tree based framework of individual BPS units to generate predictions based on the training and model of each of the individual BPS units (Ranjan, e.g., 20:61-21:3, “the machine learning algorithm uses a Naïve Bayes Tree, which is a hybrid of decision-tree classifiers and Naïve-Bayes classifiers. In particular, the decision-tree nodes contain univariate splits as regular decision-trees, but the leaves contain Naïve-Bayesian classifiers. By applying the machine learning algorithm using the Naïve Bayes Tree, the function F(X) is based on the decision tree …”) for the purpose of assembling a combinatory classification model that may be dynamically updated in order to keep pace with changing behavior of prediction targets (Ranjan, e.g., Abs. and 4:56-60).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that performing the automated program synthesis includes grouping the BPS units in a tree based framework, and processing input by the tree based framework to generate predictions based on the training and model of each of the BPS units because the disclosure of Ranjan shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that performing the automated program synthesis includes grouping the classification units (including naïve-Bayesian classifiers) in a tree based framework, and processing input by the tree based framework to generate predictions based on the training and model of each of the classification units for the purpose of assembling a combinatory classification model that may be dynamically updated in order to keep pace with changing behavior of prediction targets (Ranjan, Id.).
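
For illustration only, the hybrid structure quoted from Ranjan (univariate splits at the decision nodes, naïve Bayes classifiers at the leaves) can be sketched as follows. This is a minimal one-split example, not Ranjan's algorithm. The class names, the fixed threshold, the choice of the first feature as the split variable, and the add-one smoothing are hypothetical assumptions; the sketch also assumes that each side of the split receives at least one training row.

```python
import math
from collections import Counter


class NaiveBayesLeaf:
    """Categorical naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self, rows, labels):
        self.label_counts = Counter(labels)
        self.total = len(labels)
        self.feature_counts = Counter()   # (feature index, value, label) -> count
        self.values = {}                  # feature index -> set of seen values
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.feature_counts[(i, value, label)] += 1
                self.values.setdefault(i, set()).add(value)

    def predict(self, row):
        def log_posterior(label):
            count = self.label_counts[label]
            score = math.log(count / self.total)
            for i, value in enumerate(row):
                seen = self.values.get(i, set()) | {value}
                smoothed = self.feature_counts[(i, value, label)] + 1
                score += math.log(smoothed / (count + len(seen)))
            return score
        return max(self.label_counts, key=log_posterior)


class NaiveBayesTree:
    """One univariate split on the first (numeric) feature; each side of the
    split holds its own naive Bayes leaf over the remaining features."""

    def __init__(self, rows, labels, threshold):
        self.threshold = threshold
        sides = {True: ([], []), False: ([], [])}
        for (size, *categorical), label in zip(rows, labels):
            leaf_rows, leaf_labels = sides[size < threshold]
            leaf_rows.append(tuple(categorical))
            leaf_labels.append(label)
        self.leaves = {side: NaiveBayesLeaf(r, l)
                       for side, (r, l) in sides.items()}

    def predict(self, row):
        size, *categorical = row
        return self.leaves[size < self.threshold].predict(tuple(categorical))


# Toy data: the meaning of the colour feature flips across the split.
rows = [(1, "red"), (2, "red"), (1, "blue"),
        (8, "red"), (9, "red"), (9, "blue")]
labels = ["A", "A", "B", "B", "B", "A"]
tree = NaiveBayesTree(rows, labels, threshold=5)
```

Because each leaf is fitted only on the rows routed to it, the same categorical value can lead to different predictions on either side of the split, which a single naïve Bayes classifier over all rows could not express.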

Claims 16-17 and 28-29 are rejected under 35 U.S.C. 103 as being unpatentable over Morris in view of Liu et al., U.S. 2013/0132311 A1 (hereinafter “Liu”).

Regarding claim 16, the rejection of claim 15 is incorporated, and Morris further teaches: wherein the master program synthesis unit is trained through jointly approximating and modeling behaviors of a whole set of each individual program synthesis unit (Morris, e.g., ¶80, “outputs of the trained sub-models can be combined as a prediction output dataset that the super-model subsequently utilizes as an input set to produce the final prediction output … super-model can serve as the supervisory model, that is the ‘judge,’ to allow selection of the sub-model or combination of sub-models that perform best at modeling the outcome of interest, to produce a final consolidated predicted output 196 having substantially the highest utility …” See also, e.g., ¶86, “sub-models may be tuned … where such tuning can be conducted by applying validation protocols. For example, a number or percentage of false-negatives and/or false-positives … may be tuned to be in a desired range … the sub-models may be combined with weights or with super-model trigger rules, such that the aggregate super-model exhibits a particular desired range of false-positives and/or false-negatives …”).
	Morris does not more particularly teach that the joint training is accomplished by using a minimization algorithm. However, Liu does teach: [the master program synthesis unit is trained through jointly approximating and modeling behaviors of a set of individual units] by utilizing a minimization algorithm (Liu, e.g., ¶78, “In an example of a fusion learner, let the overall data and their fusion residue be organized … With the costs organized in a diagonal matrix … linear, regularized least-square fusion can be applied, and a weighted MMSE [minimum mean-squared error] solution can be solved for, such as a solution that minimizes mean-squared fusion residue”) for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, e.g., ¶¶22-31).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that the joint training is accomplished by using a minimization algorithm because the disclosure of Liu shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that the joint training is accomplished by using a minimization algorithm for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, Id.).

Regarding claim 17, the rejection of claim 16 is incorporated, and Liu further teaches: wherein the minimization algorithm comprises at least one of[13] a sum of all updating functions of each BPS unit, a minimize average of all updating functions of each BPS unit, a least squares method, and a gradient based method (Liu, e.g., ¶78, “In an example of a fusion learner, let the overall data and their fusion residue be organized … With the costs organized in a diagonal matrix … linear, regularized least-square fusion can be applied, and a weighted MMSE [minimum mean-squared error] solution can be solved for, such as a solution that minimizes mean-squared fusion residue”) for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, e.g., ¶¶22-31).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that the joint training is accomplished by using a minimization algorithm because the disclosure of Liu shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that the joint training is accomplished by using a minimization algorithm for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, Id.).
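
For illustration only, a weighted, regularized least-squares fusion of two sub-model outputs can be sketched as follows. The sketch is not Liu's implementation. The function name fuse_weights, the restriction to two sub-models, and the toy data are hypothetical assumptions; the sketch solves the normal equations (AᵀCA + λI)w = AᵀCy, where C is a diagonal matrix of per-example costs and λ is the regularization strength.

```python
def fuse_weights(preds_a, preds_b, targets, costs, lam=0.0):
    """Solve the 2x2 weighted ridge normal equations for fusion weights.

    preds_a, preds_b: outputs of two sub-models on the same examples.
    targets:          desired fused output per example.
    costs:            per-example cost (diagonal of the cost matrix C).
    lam:              regularization strength (lambda).
    """
    rows = list(zip(preds_a, preds_b, targets, costs))
    s_aa = sum(c * a * a for a, b, y, c in rows) + lam
    s_bb = sum(c * b * b for a, b, y, c in rows) + lam
    s_ab = sum(c * a * b for a, b, y, c in rows)
    t_a = sum(c * a * y for a, b, y, c in rows)
    t_b = sum(c * b * y for a, b, y, c in rows)

    det = s_aa * s_bb - s_ab * s_ab
    if det == 0:
        raise ValueError("normal equations are singular; increase lam")

    # Cramer's rule for the 2x2 system.
    w_a = (t_a * s_bb - s_ab * t_b) / det
    w_b = (s_aa * t_b - s_ab * t_a) / det
    return w_a, w_b


# Toy data in which the target is exactly the average of the two outputs.
preds_a = [1.0, 2.0, 3.0, 4.0]
preds_b = [2.0, 1.0, 4.0, 3.0]
targets = [1.5, 1.5, 3.5, 3.5]
costs = [1.0, 1.0, 1.0, 1.0]

w_a, w_b = fuse_weights(preds_a, preds_b, targets, costs)
```

With no regularization the solution recovers the equal weighting that generated the targets; a positive lam shrinks both weights toward zero, trading a small fitting error for stability when the sub-model outputs are nearly collinear.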

Regarding claim 28, the rejection of claim 27 is incorporated, and Morris further teaches: wherein the master program synthesis unit is trained through jointly approximating and modeling behaviors of a whole set of each individual program synthesis unit (Morris, e.g., ¶80, “outputs of the trained sub-models can be combined as a prediction output dataset that the super-model subsequently utilizes as an input set to produce the final prediction output … super-model can serve as the supervisory model, that is the ‘judge,’ to allow selection of the sub-model or combination of sub-models that perform best at modeling the outcome of interest, to produce a final consolidated predicted output 196 having substantially the highest utility …” See also, e.g., ¶86, “sub-models may be tuned … where such tuning can be conducted by applying validation protocols. For example, a number or percentage of false-negatives and/or false-positives … may be tuned to be in a desired range … the sub-models may be combined with weights or with super-model trigger rules, such that the aggregate super-model exhibits a particular desired range of false-positives and/or false-negatives …”).
	Morris does not more particularly teach that the joint training is accomplished by using a minimization algorithm. However, Liu does teach: [the master program synthesis unit is trained through jointly approximating and modeling behaviors of a set of individual units] by utilizing a minimization algorithm (Liu, e.g., ¶78, “In an example of a fusion learner, let the overall data and their fusion residue be organized … With the costs organized in a diagonal matrix … linear, regularized least-square fusion can be applied, and a weighted MMSE [minimum mean-squared error] solution can be solved for, such as a solution that minimizes mean-squared fusion residue”) for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, e.g., ¶¶22-31).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that the joint training is accomplished by using a minimization algorithm because the disclosure of Liu shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that the joint training is accomplished by using a minimization algorithm for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, Id.).

Regarding claim 29, the rejection of claim 28 is incorporated, and Liu further teaches: wherein the minimization algorithm comprises at least one of[14] a sum of all updating functions of each BPS unit, a minimize average of all updating functions of each BPS unit, a least squares method, and a gradient based method (Liu, e.g., ¶78, “In an example of a fusion learner, let the overall data and their fusion residue be organized … With the costs organized in a diagonal matrix … linear, regularized least-square fusion can be applied, and a weighted MMSE [minimum mean-squared error] solution can be solved for, such as a solution that minimizes mean-squared fusion residue”) for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, e.g., ¶¶22-31).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for dynamically updating or adapting combinatory predictive modeling as taught by Morris to provide that the joint training is accomplished by using a minimization algorithm because the disclosure of Liu shows that it was known to those of ordinary skill in the pertinent art to improve a system and method for combining a series of classifiers to provide that the joint training is accomplished by using a minimization algorithm for the purpose of ensuring a best-fitting fusion model performable to execute feature-based, object-based and/or object-based classification and browsing of multimedia data to include video and audio (Liu, Id.).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. In particular:
Langford et al., U.S. 2017/0308789 A1, teaches systems and methods for efficiently training neural networks including training data batching and selection, model striping and parameter exchange, wherein the exchanging includes transmission of gradient values between nodes;
Liang, H., Yan, Y. (2006). Learning Naïve Bayes Tree for Conditional Probability Estimation, teaches an improved method of learning a Naive Bayes Tree, wherein for each of a plurality of attributes, the data set having those attributes is partitioned, a Naive Bayes Classifier is learned for each partition, wherein a conditional log likelihood is evaluated to determine each split of the tree, and wherein each node in the tree comprises naive Bayes classifiers;
Panda et al., "FALCON: Feature Driven Selective Classification for Energy-Efficient Image Recognition," teaches methods for utilizing characteristic feature consensus to decompose a classification problem and construct a tree of classifier nodes with a generic-to-specific transition in the classification hierarchy, wherein nodes may be activated or deactivated depending on the input data, resulting in more efficient training and prediction in a classifier model; and
Rhoads et al., U.S. 8,755,837 B2, teaches systems and methods for processing image data, to include image enhancement, object segmentation and extraction etc., in order to extract features to be used to find images with similar metrics, wherein the similarity search may be based on a plurality of features.
Examiner has identified particular references contained in the prior art of record within the body of this action for the convenience of Applicant. Although the citations made are representative of the teachings in the art and are applied to the specific limitations within the enumerated claims, the teaching of the cited art as a whole is not limited to the cited passages. Other passages and figures may apply. Applicant, in preparing the response, should consider fully the entire reference as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art and/or disclosed by Examiner.
Examiner respectfully requests that, in response to this Office Action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist Examiner in prosecuting the application.
When responding to this Office Action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 C.F.R. 1.111(c).
Examiner interviews are available via telephone and video conferencing using a USPTO-supplied web-based collaboration tool. Applicant is encouraged to submit an Automated Interview Request (AIR) which may be done via https://www.uspto.gov/patent/uspto-automated-interview-request-air-form, or may contact Examiner directly via the methods below.
Any inquiry concerning this communication or earlier communication from Examiner should be directed to Andrew M. Lyons, whose telephone number is (571) 270-3529, and whose fax number is (571) 270-4529. The examiner can normally be reached Monday to Friday from 10:00 AM to 6:00 PM EST.
If attempts to reach Examiner by telephone are unsuccessful, Examiner’s supervisor, Wei Zhen, can be reached at (571) 272-3708. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at (866) 217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call (800) 786-9199 (in USA or Canada) or (571) 272-1000.
/Andrew M. Lyons/
Examiner, Art Unit 2191

    
        1 See, e.g., Pacione et al., U.S. 2013/0158368 A1, at ¶238, “Further examples of the … machine learning method that may be used … include … neural network processing … support vector machine prediction … Gaussians, Bayes nets, dynamic Bayesian networks … and algorithmic predictors, e.g., learned by evolutionary computation or other program synthesis tools.” See also, e.g., Kim et al., U.S. 2017/0337740 A1, at ¶95, “For example, the synthesis program of the control unit 117 identifies an eye region using a cascaded classifier for detecting a face and an eye using a feature vector …”
        2 See, e.g., “Support vector machine,” Wikipedia, last retrieved from https://en.wikipedia.org/wiki/Support-vector_machine on 29 September 2022, noting that, in 2011, the SVM was shown to admit a Bayesian interpretation through the technique of data augmentation.
        3 See, e.g., Pacione et al., U.S. 2013/0158368 A1, at ¶238, “Further examples of the … machine learning method that may be used … include … neural network processing … support vector machine prediction … Gaussians, Bayes nets, dynamic Bayesian networks … and algorithmic predictors, e.g., learned by evolutionary computation or other program synthesis tools.” See also, e.g., Kim et al., U.S. 2017/0337740 A1, at ¶95, “For example, the synthesis program of the control unit 117 identifies an eye region using a cascaded classifier for detecting a face and an eye using a feature vector …”
        4 See, e.g., “Support vector machine,” Wikipedia, last retrieved from https://en.wikipedia.org/wiki/Support-vector_machine on 29 September 2022, noting that, in 2011, the SVM was shown to admit a Bayesian interpretation through the technique of data augmentation.
        5 See, e.g., Pacione et al., U.S. 2013/0158368 A1, at ¶238, “Further examples of the … machine learning method that may be used … include … neural network processing … support vector machine prediction … Gaussians, Bayes nets, dynamic Bayesian networks … and algorithmic predictors, e.g., learned by evolutionary computation or other program synthesis tools.” See also, e.g., Kim et al., U.S. 2017/0337740 A1, at ¶95, “For example, the synthesis program of the control unit 117 identifies an eye region using a cascaded classifier for detecting a face and an eye using a feature vector …”
        6 See, e.g., “Support vector machine,” Wikipedia, last retrieved from https://en.wikipedia.org/wiki/Support-vector_machine on 29 September 2022, noting that, in 2011, the SVM was shown to admit a Bayesian interpretation through the technique of data augmentation.
        7 See, e.g., Pacione et al., U.S. 2013/0158368 A1, at ¶238, “Further examples of the … machine learning method that may be used … include … neural network processing … support vector machine prediction … Gaussians, Bayes nets, dynamic Bayesian networks … and algorithmic predictors, e.g., learned by evolutionary computation or other program synthesis tools.” See also, e.g., Kim et al., U.S. 2017/0337740 A1, at ¶95, “For example, the synthesis program of the control unit 117 identifies an eye region using a cascaded classifier for detecting a face and an eye using a feature vector …”
        8 See, e.g., “Support vector machine,” Wikipedia, last retrieved from https://en.wikipedia.org/wiki/Support-vector_machine on 29 September 2022, noting that, in 2011, the SVM was shown to admit a Bayesian interpretation through the technique of data augmentation.
        9 See, e.g., Pacione et al., U.S. 2013/0158368 A1, at ¶238, “Further examples of the … machine learning method that may be used … include … neural network processing … support vector machine prediction … Gaussians, Bayes nets, dynamic Bayesian networks … and algorithmic predictors, e.g., learned by evolutionary computation or other program synthesis tools.” See also, e.g., Kim et al., U.S. 2017/0337740 A1, at ¶95, “For example, the synthesis program of the control unit 117 identifies an eye region using a cascaded classifier for detecting a face and an eye using a feature vector …”
        10 See, e.g., “Support vector machine,” Wikipedia, last retrieved from https://en.wikipedia.org/wiki/Support-vector_machine on 29 September 2022, noting that, in 2011, the SVM was shown to admit a Bayesian interpretation through the technique of data augmentation.
        11 See, e.g., Pacione et al., U.S. 2013/0158368 A1, at ¶238, “Further examples of the … machine learning method that may be used … include … neural network processing … support vector machine prediction … Gaussians, Bayes nets, dynamic Bayesian networks … and algorithmic predictors, e.g., learned by evolutionary computation or other program synthesis tools.” See also, e.g., Kim et al., U.S. 2017/0337740 A1, at ¶95, “For example, the synthesis program of the control unit 117 identifies an eye region using a cascaded classifier for detecting a face and an eye using a feature vector …”
        12 See, e.g., “Support vector machine,” Wikipedia, last retrieved from https://en.wikipedia.org/wiki/Support-vector_machine on 29 September 2022, noting that, in 2011, the SVM was shown to admit a Bayesian interpretation through the technique of data augmentation.
        13 See, e.g., Metzler et al., U.S. 2021/0125108 A1, at ¶¶64, 66, disclosing training a machine learning model to minimize the sum of the adjusted losses of all training examples in the training data, and/or to minimize the average of the adjusted losses for all of the training examples in the training data. See also, e.g., Aggarwal, U.S. 2018/0150757 A1, teaching that it is known to use a gradient descent technique for model cost minimization.
        14 See, e.g., Metzler et al., U.S. 2021/0125108 A1, at ¶¶64, 66, disclosing training a machine learning model to minimize the sum of the adjusted losses of all training examples in the training data, and/or to minimize the average of the adjusted losses for all of the training examples in the training data. See also, e.g., Aggarwal, U.S. 2018/0150757 A1, teaching that it is known to use a gradient descent technique for model cost minimization.