Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/07/2022 has been entered.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/07/2022 was filed before the mailing date of the first office action. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.	

Status of Claims
This action is in response to the amendments filed 02/07/2022. Claims 1-2, 6-7, 10, 14, and 21-26 have been amended. Claims 1-2, 6-12, 14, 16-19, and 21-26 are currently pending.
	

Response to Arguments
Applicant’s amendments regarding the 101 rejection have been fully considered and are persuasive; therefore, the 101 rejection of claims 1-2, 6-12, 14, 17-19, and 21-26 is withdrawn.
Applicant’s amendments and arguments regarding the prior art rejections have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground of rejection is made in view of Holtham et al in view of Ferguson et al. Holtham teaches a method for augmenting training data sets using Monte Carlo methods and Ferguson teaches storing training data sets in a database implemented in memory. The prior rejections have been updated to include the amended limitations and to clarify the reasoning given for the limitations that were not amended.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-2, 6, and 21-26 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Claim 1 recites new matter, as the previous claims and the specification do not describe how storing artificial training data values in a database comprises storing values for at least a portion of the plurality of elements of at least one input vector of the one or more input vectors in table columns corresponding to particular elements of the at least one input vector, the table columns being defined in a schema maintained by the database. While paragraph [0064] of the specification recites “Creating the database at 214 (or a altering a previously created database) may include defining multiple fields, multiple tables, or other database objects, and defining interrelationships between the tables, fields, or the other database objects”, this does not disclose how input vectors are stored in columns corresponding to particular elements or defining a schema for the table columns.
Dependent claims 2, 6, and 21-26 are also rejected because they fail to correct the deficiencies of independent claim 1 on which they depend.
Claim 25 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Claim 25 recites new matter, as the previous claims and the specification do not describe how to select a domain from a plurality of domains, or that one or more domains may use the same data foundation but a value of at least one parameter for the domain differs between domains of the plurality of domains.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 6, and 21-26 are rejected under 35 U.S.C. 103 as being unpatentable over Holtham et al (US 20180247227 A1, herein Holtham) in view of Ferguson (US 6944616 B2, herein Ferguson), in further view of Goodman (US 7266492 B2, herein Goodman).
Regarding claim 1, Holtham teaches a computing system comprising:
one or more memories; one or more hardware processing units coupled to the one or more memories; and one or more computer readable storage media storing instructions that, when executed, cause the computing system to perform operations (para. [0061] recites one example of a hardware platform 1022 that can be used to implement the disclosed systems and techniques of FIGS. 8 and 9 (i.e. methods of data augmentation) is shown in FIG. 10 and includes a processor 1018 (for example a CPU, GPU, dedicated machine learning processor, a combination of these options, or another suitable processor), a non-volatile storage 1014 and a volatile storage 1016 where the learnt parameters and augmented training data may be stored) comprising:
identifying one or more input vectors for a machine-learning system, respective input vectors of the one or more input vectors comprising a plurality of elements (para. [0047] recites the training process is provided with original training data at block 800. Original training data can include images, videos, audio files or other numerical datasets such as financial data, geoscience data or climate data (i.e. a plurality of elements). Para [0029] recites inputs 200 are first compressed using a compression algorithm (for example MPEG-1 Audio Layer-3 (MP3), JPEG, JPEG 2000, MPEG etc.) using only a few basis vectors to represent the input in a process 202 before the neural network parameters are trained in a process 204 (Examiner’s Note: Holtham teaches multiple embodiments, the first being a method to train a neural network and the second to augment training data that is used to train the neural network from the first embodiment. One of ordinary skill would understand that the inputs from para. [0047] could be represented by the basis vectors from para. [0029]));
retrieving one or more parameters for the training data based on a domain of the machine-learning system (fig. 8 and para. [0048] recite at block 802, the training inputs are input into a parameter estimation module that estimates the parameters of the mathematical model behind the data. If no training data is available, the estimated parameters can be created from prior knowledge of the problem which the machine learning algorithm is trying to learn. For example, domain experts such as doctors and researchers, will have an understanding of the behavior of tumor growth and the expected model parameters. Geophysicists will have a knowledge of the expected geometries and seismic velocities of salt bodies, sediments, and oil reserves. Generally, if you have a real-world phenomenon to analyze, then that would be your training data (i.e. retrieving parameters for the training data based on the intended domain));
retrieving one or more functions for generating the training data corresponding to the one or more input vectors (fig. 8 and para. [0051] recite at block 808, the parameter estimation module combines the models produced at both block 804 using Monte Carlo type simulations to produce a training data simulation model, as well as any at block 806 that are based on domain knowledge. To illustrate, consider the following example. For seismic examples, we have data (block 800) from which the seismic velocity of the subsurface can be estimated. Based on this estimated seismic model, the velocities of the models can be varied based on a probability density function to produce a set of N models with realistic and different seismic velocities and geometries (i.e. using a specific function for generating the training data that corresponds to the input));
accessing one or more data sources to retrieve one or more sets of data for building a data foundation for generating the training data (para. [0050] recites additionally, other information based on domain expert knowledge can be incorporated into the data augmentation pipeline at block 806. Returning to the example of the brain imagery application, it may be known by medical experts that tumor growth rates and elastic parameters vary depending on the region of the brain and brain geometry (i.e. accessing data sources to build a data foundation for generating training data));
selecting a plurality of data foundation values from the data foundation according to one or more parameters to provide selected data foundation values (para. [0051] recites additional models can be generated in block 806 based on additional information not present in the initial training data (in this seismic example, there may be drill holes with measured seismic velocity with depth, or geologic information that could be converted to seismic velocity). This additional information from the drill holes could be used to create an additional set of M models. Block 808 would append the N models generated from the original data, with the M models based on additional information into a new set of P (P≥N+M) models from which data can be simulated in block 810 (i.e. values from the data foundation can be selected according to certain parameters)), wherein the one or more parameters (i) constrain values used as input to one or more functions that generate at least a portion of the plurality of data values; (ii) define a set of one or more values useable as input to the one or more functions; or (iii) at least in part define the operation of a function of the one or more functions (para. [0049] recites once the parameter estimation process has been performed, at block 804 the parameter estimation module can perform Monte Carlo type model parameter generation. In other examples, other probabilistic methods (e.g., Gaussian random processes) can be used in addition to or instead of Monte Carlo methods. In this step, a set of possible model parameters are populated using a probability distribution for all the variables that have inherent uncertainty. The set of models is then generated by sampling the probability functions (i.e. the parameters can define a set of values that are usable as input to an intended function));
at least in part using an instance of an abstract data type stored in the one or more memories (para. [0047] recites original training data can include images, videos, audio files or other numerical datasets such as financial data, geoscience data or climate data. Block 800 is depicted with a cross-sectional image of a brain scan, for example a CT scan or magnetic resonance imaging (MRI) scan, however it will be appreciated that the disclosed training data augmentation can be used with a variety of different types of data. In some examples, original training data may be a limited data set, an unbalanced data set, or an empty data set that may benefit from augmentation with simulated data as described herein (i.e. the system uses an abstract data type. Examiner’s Note: block 1016 of fig. 10 shows storage for models and data, which would include the abstract data types from para. [0047])), creating artificial training data values corresponding to the one or more input vectors by using the selected data foundation values as input to the one or more functions, wherein, for a given artificial training data value, the artificial training data value is different than the one or more values used as input to the one or more functions that produced the artificial training data value (para. [0052] recites at block 810, the combined model is used to simulate training data that comports with the features defined by the training data simulation model (i.e. the data foundation from block 808 is used as input to generate artificial data in block 810). Fig. 9 and para. [0060] recite further details of an embodiment of block 810 of FIG. 8 are shown in FIG. 9. Block 900 involves defining the appropriate modelling equations based on the machine learning problem of interest. Using the seismic example, the relevant equations may be the elastic or inelastic wave equation (i.e. the one or more functions). 
Block 902 defines the parameters relevant to the simulations such as source and receiver positions, noise parameters and sampling rates etc. For the MRI example, this may include among others, imaging parameters, equipment specifications and geometry. Block 904 defines the numerical simulation technique such as finite volume, finite element, or finite volume etc. Block 906 discretizes the modelling domain (such as the earth or brain) onto a mesh (regular rectangular mesh, polygonal mesh, tetrahedral mesh, etc.) upon which the numerical simulations will be performed. Block 908 populates the cells in the discretized meshes based on the models generated from the output of block 808. Block 910 solves the numerical modelling equations using solvers such as direct linear solvers or sparse matrix solvers. Block 912 generates the augmented images or videos etc. based on the computed numerical solutions from block 910 (i.e. using the one or more functions to generate artificial data that is different than the input data from the data foundation)) 
and at least one function of the one or more functions comprises (1) one or more mathematical operations applied to the one or more elements of the at least one input vector of the one or more input vectors to provide a result that is different than the one or more elements; or (2) a mathematical selection or distribution function (fig. 9 and para. [0060] recite further details of an embodiment of block 810 of FIG. 8 are shown in FIG. 9. Block 900 involves defining the appropriate modelling equations based on the machine learning problem of interest. Using the seismic example, the relevant equations may be the elastic or inelastic wave equation (i.e. the one or more functions). Block 902 defines the parameters relevant to the simulations such as source and receiver positions, noise parameters and sampling rates etc. For the MRI example, this may include among others, imaging parameters, equipment specifications and geometry. Block 904 defines the numerical simulation technique such as finite volume, finite element, or finite volume etc. (i.e. the at least one function comprises a mathematical operation));
generating a plurality of vectors, implemented at least in part in the one or more memories, comprising artificial training data values, the vectors being implemented as data structures in the memory of the computing system (para. [0055] recites after the unrealistic data examples have been removed at block 812, the final augmented dataset (represented by the identified realistic or true examples of training data) is stored and can be used for subsequent machine learning applications (Examiner’s Note: as shown by the citation of para. [0029] above, the second embodiment relied upon here would be understood by one of ordinary skill in the art to use vectors to represent both input and output data). Fig. 10 and para. [0061] recite the final augmented data set is shown in module 1020, which can be a hardware data storage device that stores the augmented training data (i.e. the artificial data values are stored in the memory of the computing system));
and training the machine-learning system with the plurality of vectors by processing the plurality of vectors using the machine learning system to provide a trained machine learning model (fig. 2 shows the steps of training a machine learning system; as noted above, one of ordinary skill would understand that the machine learning system can be trained with the augmented data sets created by the method shown in fig. 8), submitting input to the trained machine learning model; and by the trained machine learning model, generating a result at least in part using the input (fig. 6 shows the steps of using a previously trained machine learning system and generating a predicted result; as noted above, one of ordinary skill would understand that the trained machine learning system can be utilized with the augmented data sets created by the method shown in fig. 8. Para. [0030] recites the network parameters are updated during training using the previous iteration parameters as a starting point. If the current inputs are at the required or desired compression (decision point 212), the obtained optimized network parameters are the final parameters for the neural network 214 (i.e. using the model is an iterative process and can generate a result)).
Holtham does not explicitly teach determining a database for storing training data; and storing the generated training data in the database.
Ferguson teaches determining a database for storing training data (col. 24 lines 6-9 recite specifically, as shown by step or module 206, training input data may be stored with associated timestamps in the historical database 1210 (i.e. a database for storing training data)); 
and storing the generated training data in the database (col. 24 lines 4-6 recite as shown in fig. 6, in addition to the sampling and storing of input data at specified input data storage intervals, training input data 1306 may also be stored with associated timestamps in the historical database 1210 (i.e. the generated training data is stored in the database)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by using the user interface and database storage from Ferguson with the training data augmentation method from Holtham. Ferguson and Holtham are both directed to systems involving training data for machine learning, but while Holtham teaches storage for training data, it does not explicitly teach that the training data is stored in a database.  One of ordinary skill in the art would benefit from the explicitly defined features in Ferguson, as the user interface would make it clear how to input parameters and functions to shape the training data; in addition, the database would make it clear to one of ordinary skill where the training data could be accessed if needed.
The combination of Holtham and Ferguson does not teach wherein storing training data values in a database comprises storing values for at least a portion of the plurality of elements of at least one input vector of the one or more input vectors in table columns corresponding to particular elements of the at least one input vector, the table columns being defined in a schema maintained by the database.
Goodman teaches wherein storing data values in a database comprises storing values for at least a portion of the plurality of elements of at least one input vector of the one or more input vectors in table columns corresponding to particular elements of the at least one input vector, the table columns being defined in a schema maintained by the database (figs. 2 and 3 both teach data structures wherein input data is stored in table columns. Col. 6 line 67 – col. 7 line 3 recite the data structure 200 includes Y number of rows and I number of columns, where Y is the number of output classes (values for y), I is the number of training instances (i.e. the table columns correspond to elements of the input training instances and are organized by the database)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by using the iterative training method from Goodman to train the machine-learning system from Holtham (as modified by Ferguson). Goodman and Holtham (as modified by Ferguson) are both directed to methods of facilitating machine learning training, and while Ferguson teaches storing training data in a database, neither Holtham nor Ferguson teaches a specific arrangement of storing training data in table columns. One of ordinary skill would benefit from using the arrangement of training data from Goodman to allow a user to more specifically organize training data, which would improve the user’s ability to keep track of which data to use and improve performance of the model.
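For illustration only, the limitation discussed above (storing values for elements of an input vector in table columns that correspond to particular elements, with the columns defined in a schema maintained by the database) can be pictured with the following minimal sketch. It is not drawn from the application or from Holtham, Ferguson, or Goodman. The use of SQLite, the table name training_data, the element names, and the sample values are all assumptions chosen for the example.

```python
import sqlite3

# Hypothetical input vector definition: each element maps to one table column.
VECTOR_ELEMENTS = ("velocity", "depth", "density")


def store_vectors(conn, vectors):
    """Store each vector so that element i lands in the column named for element i."""
    columns = ", ".join(f"{name} REAL" for name in VECTOR_ELEMENTS)
    # The CREATE TABLE statement defines the schema that the database maintains.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS training_data (id INTEGER PRIMARY KEY, {columns})"
    )
    placeholders = ", ".join("?" for _ in VECTOR_ELEMENTS)
    conn.executemany(
        f"INSERT INTO training_data ({', '.join(VECTOR_ELEMENTS)}) VALUES ({placeholders})",
        vectors,
    )
    conn.commit()


def schema_columns(conn):
    """Read the column names back from the schema kept by the database."""
    return [row[1] for row in conn.execute("PRAGMA table_info(training_data)")]


# Assumed sample vectors; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
store_vectors(conn, [(1500.0, 10.0, 2.1), (1620.5, 20.0, 2.3)])
```

In this sketch the column list is generated from the vector definition, so each stored value can be traced to a particular vector element by querying the schema rather than by position alone.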
Regarding claim 2, the combination of Holtham, Ferguson, and Goodman teaches the computing system according to claim 1, wherein determining the database comprises analyzing the one or more input vectors to determine data definitions for the one or more input vectors (Ferguson col. 29 lines 14-16 recite that once the data system has been specified, the user may specify the data type using step or module 3204: specify data type (i.e. the data definitions)) and generating a database for storing data for the one or more input vectors based on the determined data definitions (Ferguson col. 29 lines 16-17 recite that the data type may indicate which of the many types of data and/or storage modes is desired (i.e. generating a database based on the data definitions)).
Regarding claim 6, the combination of Holtham, Ferguson, and Goodman teaches the computing system according to claim 1, wherein the creating the artificial training data values comprises using one or more statistical models (Holtham fig. 8 and para. [0052] recite at block 810, the combined model (Examiner’s Note: the combined model is created during step 808 from the models in steps 804 and 806) is used to simulate training data that comports with the features defined by the training data simulation model. For example, for the brain imagery example, using the set of different brain geometries and growth rates, tumors of varying sizes and geometries can be mathematically modelled in different regions of the brain to produce a comprehensive set of possible brain images. Because the simulated data is based on the training data simulation model, which represents both the estimated parameters of the original training data as well as any problem-specific constraints leveraged from domain knowledge, the simulated data can be realistic in nature and thus usable for training a machine learning model to estimate or classify the parameters of actual training data of a similar nature (i.e. the artificial data is created based on a mathematical model)).
Regarding claim 16, the combination of Holtham, Ferguson, and Goodman teaches the method according to claim 14, further comprising: in response to training the machine-learning system, evaluating the machine-learning system; and based on the results of the evaluation of the machine-learning system, creating additional one or more sets of artificial values and iteratively training the machine-learning system with the additional one or more sets of artificial values (Goodman col. 2, lines 36-42 recite that maximum entropy models are conventionally learned using generalized iterative scaling (GIS). At each iteration, a step is taken in a direction that increases the likelihood of the training data (i.e. the machine-learning system is trained iteratively). The step size is determined to be not too large and not too small (i.e. the machine-learning system is evaluated between iterations): the likelihood of the training data increases at each iteration and eventually converges to the global optimum).
Regarding claim 18, the combination of Holtham, Ferguson, and Goodman teaches the method according to claim 14, wherein the values of the set of values are generated evenly across a range of possible values (Goodman col. 2, lines 33-35 recite that maxent models are as close as possible to the uniform distribution (i.e. values are generated evenly), subject to constraint satisfaction).
Regarding claim 21, the combination of Holtham, Ferguson, and Goodman teaches the computing system according to claim 1, wherein an expected output for a vector is provided with the vector to the machine learning system (Ferguson col. 19 lines 63-67 – col. 20 lines 1-2 recite during training, the support vector machine 1206 may use its input data 1220 to produce predicted output data 1218. These predicted output data values 1218 may be used in combination with training input data 1306 to produce error data. These error data values may then be used to adjust the coefficients of the support vector machine (i.e. the expected output for a vector is provided with the vectors to the machine learning system)).
Regarding claim 22, the combination of Holtham, Ferguson, and Goodman teaches the computing system of claim 1, the operations further comprising: analyzing the data foundation to determine at least one parameter of the one or more parameters (Holtham fig. 8 and para. [0047] recite fig. 8 depicts a flowchart of steps for simulating training data in a machine learning system as described herein. To begin, the training process is provided with original training data at block 800. Original training data can include images, videos, audio files or other numerical datasets such as financial data, geoscience data or climate data. Block 800 is depicted with a cross-sectional image of a brain scan, for example a CT scan or magnetic resonance imaging (MRI) scan, however it will be appreciated that the disclosed training data augmentation can be used with a variety of different types of data (i.e. the data foundation). Para. [0048] recites at block 802, the training inputs are input into a parameter estimation module that estimates the parameters of the mathematical model behind the data. If no training data is available, the estimated parameters can be created from prior knowledge of the problem which the machine learning algorithm is trying to learn. For example, domain experts such as doctors and researchers, will have an understanding of the behavior of tumor growth and the expected model parameters. Geophysicists will have a knowledge of the expected geometries and seismic velocities of salt bodies, sediments, and oil reserves. Generally, if you have a real-world phenomenon to analyze, then that would be your training data. If the system has access to a simulation available of a real world phenomenon (for example CFD simulator), that could be used to generate training data with the understanding that the machine learning model would only learn as accurately as the simulator. 
The parameter estimation module can estimate the parameters by solving an inverse problem or other parameter estimation technique (i.e. determining parameters of the data foundation)).
Regarding claim 23, the combination of Holtham, Ferguson, and Goodman teaches the computing system of claim 22, wherein analyzing the data foundation comprises determining a distribution of data in the data foundation (Holtham fig. 8 and para. [0049] recite once the parameter estimation process has been performed, at block 804 the parameter estimation module can perform Monte Carlo type model parameter generation. In other examples, other probabilistic methods (e.g., Gaussian random processes) can be used in addition to or instead of Monte Carlo methods. In this step, a set of possible model parameters are populated using a probability distribution for all the variables that have inherent uncertainty. The set of models is then generated by sampling the probability functions (i.e. determining a distribution of data in the data foundation)).
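For illustration only, the Monte Carlo type model parameter generation cited from Holtham para. [0049] (populating possible model parameters by sampling a probability distribution) can be pictured with the following minimal sketch. It is not Holtham's implementation. The function name sample_models, the choice of a normal distribution, the fixed seed, and the numeric values are assumptions made for the example.

```python
import random


def sample_models(estimated_velocity, std_dev, n_models, seed=0):
    """Draw n_models perturbed values from a normal distribution around the estimate."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the sketch is repeatable
    return [rng.gauss(estimated_velocity, std_dev) for _ in range(n_models)]


# Hypothetical estimate and uncertainty for a single model parameter.
models = sample_models(estimated_velocity=2000.0, std_dev=50.0, n_models=5)
```

Each sampled value differs from the estimate supplied as input, which is the sense in which a sampled parameter set can yield N distinct models from one estimated model.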
Regarding claim 24, the combination of Holtham, Ferguson, and Goodman teaches the computing system of claim 1, wherein the one or more parameters comprise a statistical model or distribution (Ferguson col. 44 lines 15-19 recite in the process control domain, it may be desirable or productive to combine the functions of a support vector machine with other more standard control functions such as statistical tests, feedback control, etc. (i.e. a parameter can be a statistical model)).
Regarding claim 25, the combination of Holtham, Ferguson, and Goodman teaches the computing system of claim 1, wherein the domain is selected from a plurality of domains, domains of the plurality of domains using the same data foundation but wherein a value of at least one parameter for the domain differs between domains of the plurality of domains (Holtham fig. 8 and para. [0048] recite at block 802, the training inputs are input into a parameter estimation module that estimates the parameters of the mathematical model behind the data. If no training data is available, the estimated parameters can be created from prior knowledge of the problem which the machine learning algorithm is trying to learn. For example, domain experts such as doctors and researchers, will have an understanding of the behavior of tumor growth and the expected model parameters. Para. [0048] also recites for example, for machine learning predictions relating to brain images (MRI, CT scan etc.), parameters of the image data which the machine learning model may be trained to estimate or classify are brain size, brain geometry, tumor geometry, tumor growth rates, brain elasticity, and the like. Para. [0050] recites This can produce a large sample of realistic model parameters, for example brain geometries and tumor growth rates in the context of training data including brain images. Additionally, other information based on domain expert knowledge can be incorporated into the data augmentation pipeline at block 806. Returning to the example of the brain imagery application, it may be known by medical experts that tumor growth rates and elastic parameters vary depending on the region of the brain and brain geometry (Examiner’s Note: brain size and tumor growth rates are considered examples of related domains that would draw upon the same data foundation (i.e. brain images) but would have at least one different parameter. 
One of ordinary skill would understand how to select this data foundation as opposed to the other domains (i.e. the geophysics example from para. [0048]) that might be stored in the system)).
Regarding claim 26, the combination of Holtham, Ferguson, and Goodman teaches the computing system of claim 1, wherein the generating a plurality of vectors comprises randomly selecting an artificial training data value for each element of a given input vector of the one or more input vectors (Ferguson col. 19 lines 61-65 recite to train the support vector machine, the newly configured support vector machine is usually initialized by assigning random values to all of its coefficients. During training, the support vector machine 1206 may use its input data 1220 to produce predicted output data 1218 (i.e. randomly selecting artificial training values)).
	
Claims 7-9, 12, 14, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Holtham et al (US 20180247227 A1, herein Holtham) in view of Ferguson (US 6944616 B2, herein Ferguson).
Regarding claim 7, Holtham teaches one or more non-transitory computer-readable storage media comprising:
computer-executable instructions that, when executed by a computing system comprising at least one hardware processor and at least one memory coupled to the at least one hardware processor (para. [0045] recites all of the tasks and steps described herein may be embodied in, and fully automated by, executable program instructions executed by a computing system comprising computing hardware that performs one or more computing tasks), cause the computing system to receive an input vector definition for a target machine-learning system, the input vector definition comprising a definition of a plurality vector elements (para. [0047] recites the training process is provided with original training data at block 800. Original training data can include images, videos, audio files or other numerical datasets such as financial data, geoscience data or climate data (i.e. a plurality of elements). Para [0029] recites inputs 200 are first compressed using a compression algorithm (for example MPEG-1 Audio Layer-3 (MP3), JPEG, JPEG 2000, MPEG etc.) using only a few basis vectors to represent the input in a process 202 before the neural network parameters are trained in a process 204 (Examiner’s Note: Holtham teaches multiple embodiments, the first being a method to train a neural network and the second to augment training data that is used to train the neural network from the first embodiment. One of ordinary skill would understand that the inputs from para. [0047] could be represented by the basis vectors from para. [0029]));
computer-executable instructions that, when executed by a computing system, cause the computing system to determine one or more parameters for creating artificial training values for the input vector (fig. 8 and para. [0048] recite at block 802, the training inputs are input into a parameter estimation module that estimates the parameters of the mathematical model behind the data. If no training data is available, the estimated parameters can be created from prior knowledge of the problem which the machine learning algorithm is trying to learn. For example, domain experts such as doctors and researchers, will have an understanding of the behavior of tumor growth and the expected model parameters. Geophysicists will have a knowledge of the expected geometries and seismic velocities of salt bodies, sediments, and oil reserves. Generally, if you have a real-world phenomenon to analyze, then that would be your training data (i.e. determining parameters for the training data based on the intended domain));
computer-executable instructions that, when executed by a computing system, cause the computing system to create a plurality of artificial training values for the plurality of input vectors using one or more functions (para. [0052] recites at block 810, the combined model is used to simulate training data that comports with the features defined by the training data simulation model (i.e. the data foundation from block 808 is used as input to generate artificial data in block 810). Fig. 9 and para. [0060] recite further details of an embodiment of block 810 of FIG. 8 are shown in FIG. 9. Block 900 involves defining the appropriate modelling equations based on the machine learning problem of interest. Using the seismic example, the relevant equations may be the elastic or inelastic wave equation (i.e. the one or more functions). Block 902 defines the parameters relevant to the simulations such as source and receiver positions, noise parameters and sampling rates etc. For the MRI example, this may include among others, imaging parameters, equipment specifications and geometry. Block 904 defines the numerical simulation technique such as finite volume, finite element, or finite volume etc. Block 906 discretizes the modelling domain (such as the earth or brain) onto a mesh (regular rectangular mesh, polygonal mesh, tetrahedral mesh, etc.) upon which the numerical simulations will be performed. Block 908 populates the cells in the discretized meshes based on the models generated from the output of block 808. Block 910 solves the numerical modelling equations using solvers such as direct linear solvers or sparse matrix solvers. Block 912 generates the augmented images or videos etc. based on the computed numerical solutions from block 910 (i.e. 
using the one or more functions to generate artificial data that is different than the input data from the data foundation)), at least one function of the one or more functions comprising (1) one or more mathematical operations applied to the one or more elements the input vector to provide a result that is different than the one or more elements; or (2) a mathematical selection or distribution function (fig. 9 and para. [0060] recite further details of an embodiment of block 810 of FIG. 8 are shown in FIG. 9. Block 900 involves defining the appropriate modelling equations based on the machine learning problem of interest. Using the seismic example, the relevant equations may be the elastic or inelastic wave equation (i.e. the one or more functions). Block 902 defines the parameters relevant to the simulations such as source and receiver positions, noise parameters and sampling rates etc. For the MRI example, this may include among others, imaging parameters, equipment specifications and geometry. Block 904 defines the numerical simulation technique such as finite volume, finite element, or finite volume etc. (i.e. the at least one function comprises a mathematical operation));
computer-executable instructions that, when executed by a computing system, cause the computing system to generate a plurality of vectors comprising artificial training values, the vectors being implemented in the memory of the computing system (para. [0055] recites after the unrealistic data examples have been removed at block 812, the final augmented dataset (represented by the identified realistic or true examples of training data) is stored and can be used for subsequent machine learning applications (Examiner’s Note: as shown by the citation of para. [0029] above, the second embodiment relied upon here would be understood by one of ordinary skill in the art to use vectors to represent both input and output data). Fig 10 and para. [0061] recite the final augmented data set is shown in module 1020, which can be a hardware data storage device that stores the augmented training data (i.e. the artificial data values are stored in the memory of the computing system));
and training the target machine-learning system with the plurality of vectors to provide a trained machine learning model (fig. 2 shows the steps of training a machine learning system. As noted above, one of ordinary skill would understand that the machine learning system can be trained with the augmented data sets created by the method shown in fig. 8); submitting input to the trained machine learning model; and by the trained machine learning model, generating a result at least in part using the input (fig. 6 shows the steps of using a previously trained machine learning system and generating a predicted result. As noted above, one of ordinary skill would understand that the trained machine learning system can be utilized with the augmented data sets created by the method shown in fig. 8. Para. [0030] recites the network parameters are updated during training using the previous iteration parameters as a starting point. If the current inputs are at the required or desired compression (decision point 212), the obtained optimized network parameters are the final parameters for the neural network 214 (i.e. using the model is an iterative process and can generate a result)).
However, Holtham does not explicitly teach computer-executable instructions that, when executed by a computing system, cause the computing system to store the training value in a training data database.
Ferguson teaches computer-executable instructions that, when executed by a computing system, cause the computing system to store the training value in a training data database (col. 11 lines 64-67 recite the term "computer system" can be broadly defined to encompass any device having at least one processor that executes instructions from a memory medium (i.e. computer executable instructions). Col. 24 lines 4-6 recite as shown in FIG. 6, in addition to the sampling and storing of input data at specified input data storage intervals, training input data 1306 may also be stored with associated timestamps in the historical database 1210 (i.e. the generated training data is stored in the database)).
See claim 1 for motivation to combine.
	
Regarding claim 8, the combination of Holtham and Ferguson teaches the non-transitory computer-readable storage media according to claim 7, wherein receiving an input vector definition comprises analyzing the target machine-learning system to identify an input vector argument (Ferguson col. 29 lines 20-24 recite that the user may specify a data item number or identifier (i.e. an input vector argument) using step or module 3206. The data item number or identifier may indicate which of the many instances of the specified data type in the specified data system is desired).
Regarding claim 9, the combination of Holtham and Ferguson teaches the non-transitory computer-readable storage media according to claim 7, wherein determining one or more parameters comprises analyzing the input vector definition to determine a type of the input vector (Ferguson col. 29 lines 14-17 recite that the user may specify the data type using step or module 3204: specify data type. The data type may indicate which of the many types of data and/or storage modes is desired (i.e. the type of input vector)).
Regarding claim 12, the combination of Holtham and Ferguson teaches the non-transitory computer-readable storage media according to claim 7, wherein creating the artificial training values further comprises generating an expected output value for the artificial training values (Ferguson col. 19 lines 63-65 recite during training, the support vector machine 1206 may use its input data 1220 to produce predicted output data 1218 (i.e. an expected output value));
and wherein storing the training value includes storing the expected output value in the training data database (Ferguson col. 21 lines 5-7 recite the (predicted) output data value 1218 produced by the support vector machine may be stored in the historical database (i.e. the expected output value is stored in the training data database)).
Regarding claim 14, Holtham teaches a method implemented in a computer system comprising a memory and at least one hardware processor coupled to the memory (para. [0023] recites various inventive systems and methods (generally "features") that improve the operation of computer-implemented neural networks will now be described with reference to the specific embodiments shown in the drawings. Para. [0023] also recites features for augmenting training data sets will then be described with reference to FIGS. 8-10. Beneficially, these features can reduce the amount of real-world training data required to train a machine learning model to achieve a desired level of accuracy), comprising:
determining a set of one or more input vectors for the machine-learning system (para. [0047] recites the training process is provided with original training data at block 800. Original training data can include images, videos, audio files or other numerical datasets such as financial data, geoscience data or climate data (i.e. a plurality of elements). Para [0029] recites inputs 200 are first compressed using a compression algorithm (for example MPEG-1 Audio Layer-3 (MP3), JPEG, JPEG 2000, MPEG etc.) using only a few basis vectors to represent the input in a process 202 before the neural network parameters are trained in a process 204 (Examiner’s Note: Holtham teaches multiple embodiments, the first being a method to train a neural network and the second to augment training data that is used to train the neural network from the first embodiment. One of ordinary skill would understand that the inputs from para. [0047] could be represented by the basis vectors from para. [0029]));
retrieving one or more parameters for at least one input vector of the set of input vectors for creating artificial values for the respective vectors (fig. 8 and para. [0048] recite at block 802, the training inputs are input into a parameter estimation module that estimates the parameters of the mathematical model behind the data. If no training data is available, the estimated parameters can be created from prior knowledge of the problem which the machine learning algorithm is trying to learn. For example, domain experts such as doctors and researchers, will have an understanding of the behavior of tumor growth and the expected model parameters. Geophysicists will have a knowledge of the expected geometries and seismic velocities of salt bodies, sediments, and oil reserves. Generally, if you have a real-world phenomenon to analyze, then that would be your training data (i.e. retrieving parameters for the training data based on the intended domain));
creating a set of artificial values for the set of input vectors, the creating comprising executing at least one method of the one or more methods based on the one or more parameters and the values (para. [0052] recites at block 810, the combined model is used to simulate training data that comports with the features defined by the training data simulation model (i.e. the data foundation from block 808 is used as input to generate artificial data in block 810). Fig. 9 and para. [0060] recite further details of an embodiment of block 810 of FIG. 8 are shown in FIG. 9. Block 900 involves defining the appropriate modelling equations based on the machine learning problem of interest. Using the seismic example, the relevant equations may be the elastic or inelastic wave equation (i.e. one or more methods). Block 902 defines the parameters relevant to the simulations such as source and receiver positions, noise parameters and sampling rates etc. For the MRI example, this may include among others, imaging parameters, equipment specifications and geometry. Block 904 defines the numerical simulation technique such as finite volume, finite element, or finite volume etc. Block 906 discretizes the modelling domain (such as the earth or brain) onto a mesh (regular rectangular mesh, polygonal mesh, tetrahedral mesh, etc.) upon which the numerical simulations will be performed. Block 908 populates the cells in the discretized meshes based on the models generated from the output of block 808. Block 910 solves the numerical modelling equations using solvers such as direct linear solvers or sparse matrix solvers. Block 912 generates the augmented images or videos etc. based on the computed numerical solutions from block 910 (i.e. 
using one or more methods to generate artificial data that is different than the input data from the data foundation)), the at least one method comprising (1) one or more mathematical operations applied to the one or more elements of the at least one input vector to provide a result that is different than the one or more elements; or (2) a mathematical selection or distribution operation (fig. 9 and para. [0060] recite further details of an embodiment of block 810 of FIG. 8 are shown in FIG. 9. Block 900 involves defining the appropriate modelling equations based on the machine learning problem of interest. Using the seismic example, the relevant equations may be the elastic or inelastic wave equation (i.e. the at least one method). Block 902 defines the parameters relevant to the simulations such as source and receiver positions, noise parameters and sampling rates etc. For the MRI example, this may include among others, imaging parameters, equipment specifications and geometry. Block 904 defines the numerical simulation technique such as finite volume, finite element, or finite volume etc. (i.e. the at least one method comprises a mathematical operation));
generating a plurality of vectors comprising artificial values, the vectors being implemented in the memory of the computing system (para. [0055] recites after the unrealistic data examples have been removed at block 812, the final augmented dataset (represented by the identified realistic or true examples of training data) is stored and can be used for subsequent machine learning applications (Examiner’s Note: as shown by the citation of para. [0029] above, the second embodiment relied upon here would be understood by one of ordinary skill in the art to use vectors to represent both input and output data). Fig 10 and para. [0061] recite the final augmented data set is shown in module 1020, which can be a hardware data storage device that stores the augmented training data (i.e. the artificial data values are stored in the memory of the computing system));
and training the machine-learning system with the plurality of vectors by processing the plurality of vectors using the machine learning system to provide a trained machine learning model (fig. 2 shows the steps of training a machine learning system. As noted above, one of ordinary skill would understand that the machine learning system can be trained with the augmented data sets created by the method shown in fig. 8); submitting input to the trained machine learning model; and by the trained machine learning model, generating a result at least in part using the input (fig. 6 shows the steps of using a previously trained machine learning system and generating a predicted result. As noted above, one of ordinary skill would understand that the trained machine learning system can be utilized with the augmented data sets created by the method shown in fig. 8. Para. [0030] recites the network parameters are updated during training using the previous iteration parameters as a starting point. If the current inputs are at the required or desired compression (decision point 212), the obtained optimized network parameters are the final parameters for the neural network 214 (i.e. using the model is an iterative process and can generate a result)).
However, Holtham does not explicitly teach identifying one or more methods of generating values associated with the respective input vector.
Ferguson teaches identifying one or more methods of creating the artificial values associated with the respective input vector (fig. 28 and col. 49 lines 8-21 recite the first template 2600 in this set of two templates is shown. First template 2600 may specify general characteristics of how the support vector machine 1206 may operate. The portion of the screen within a box labeled 2620, for example, may show how timing options may be specified for the support vector machine module 1206. As previously described, more than one timing option may be provided. A training timing option may be provided, as shown under the label "train" in box 2620. Similarly, a prediction timing control specification (i.e. a method of generating values) may also be provided, as shown under the label "run" in box 2620. The timing methods may be chosen from a pop-up menu of various timing methods that may be implemented in one embodiment).
See claim 1 for motivation to combine.

Regarding claim 17, the combination of Holtham and Ferguson teaches the method according to claim 14, wherein the values of the set of values are generated randomly across a range of possible values (Ferguson col. 19 lines 61-65 recite to train the support vector machine, the newly configured support vector machine is usually initialized by assigning random values to all of its coefficients (i.e. a set of values is generated randomly across a range of possible values). During training, the support vector machine 1206 may use its input data 1220 to produce predicted output data 1218).

Claims 10-11 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Holtham et al (US 20180247227 A1, herein Holtham) in view of Ferguson (US 6944616 B2, herein Ferguson), and further in view of Lin et al (US 8370280 B1, herein Lin).
Regarding claim 10, the combination of Holtham and Ferguson teaches the non-transitory computer-readable storage media of claim 7.
However, the combination of Holtham and Ferguson does not explicitly teach computer-executable instructions that, when executed by a computing system, cause the computing system to associate a scoring function with the artificial training values; and wherein the training the target machine-learning system further comprises executing the associated scoring function with output from the machine-learning system when executed with the artificial training data values.
Lin teaches computer-executable instructions that, when executed by a computing system, cause the computing system to associate a scoring function with the generated training value (col. 10 lines 47-52 recite embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on computer storage medium for execution by, or to control the operation of, data processing apparatus (i.e. computer executable instructions which cause a computing system to implement the limitations of claim 10). Col. 8 lines 44-61 recite that a performance indicator can be considered any suitable quantitative measure (e.g., a metric), qualitative designation (e.g., labels such as "highly accurate", "robust", etc.) or ranking which describes the performance of a predictive model. In some implementations, performance indicators can include: accuracy metrics (e.g. predictive error percentages, confidence scores (i.e. a scoring function), etc.) that reflect the tendency of a predictive model to output correct or erroneous predictive outcomes, stability metrics (e.g., runtime error percentages) that reflect the tendency of a predictive model to successfully reach a prediction, flexibility metrics (e.g. number of parameters, highest degree of variable terms, etc.) that reflect the ability of a predictive model to glean complicated patterns from training data having several features per example, complexity metrics (e.g., average number of required computations) that reflect the computational effort required to execute a predictive model, and other comparable metrics applicable to statistical models);
and wherein the training the target machine-learning system further comprises executing the associated scoring function with output from the machine-learning system when executed with the training data value (col. 8 lines 61-65 recite that certain performance indicators can be generated or updated based on data collected after a predictive model has been executed. For example, accuracy, stability, and complexity metrics can be generated or updated after execution of a predictive model (i.e. the scoring function can be executed with output from the machine-learning system)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by applying the scoring function from Lin to the machine-learning training system from Holtham (as modified by Ferguson). Holtham and Lin are both directed to methods of predicting outputs using machine learning techniques, but Holtham only teaches a scoring function in the context of predicting matching documents in the third embodiment. One of ordinary skill would be motivated to use the scoring function from Lin on the outputs from the first and second embodiments in Holtham as well in order to find the best results from the training for use in another round of training or for future use.
Regarding claim 11, the combination of Holtham, Ferguson, and Lin teaches the non-transitory computer-readable storage media according to claim 10, wherein the training further comprises updating the machine-learning system based on results of the executed scoring function (Lin col. 8 lines 37-44 recite that respective performance indicators corresponding to a set of predictive models can be compared to determine which predictive models should be selected (i.e. updating the system to choose the next data set to be trained based on the score of each model)).
Regarding claim 19, the combination of Holtham and Ferguson teaches the method according to claim 14.
However, the combination of Holtham and Ferguson does not explicitly teach executing a scoring function based on output of the machine-learning system; and, updating the machine-learning system based on results of the scoring function.
Lin teaches executing a scoring function based on output of the machine-learning system (col. 8 lines 44-65 recite that a performance indicator can be considered any suitable quantitative measure (e.g., a metric), qualitative designation (e.g., labels such as "highly accurate", "robust", etc.) or ranking which describes the performance of a predictive model. In some implementations, performance indicators can include: accuracy metrics (e.g. predictive error percentages, confidence scores (i.e. a scoring function), etc.) that reflect the tendency of a predictive model to output correct or erroneous predictive outcomes, stability metrics (e.g., runtime error percentages) that reflect the tendency of a predictive model to successfully reach a prediction, flexibility metrics (e.g. number of parameters, highest degree of variable terms, etc.) that reflect the ability of a predictive model to glean complicated patterns from training data having several features per example, complexity metrics (e.g., average number of required computations) that reflect the computational effort required to execute a predictive model, and other comparable metrics applicable to statistical models. Certain performance indicators can be generated or updated based on data collected after a predictive model has been executed. For example, accuracy, stability, and complexity metrics can be generated or updated after execution of a predictive model (i.e. the scoring function can be executed on the output of the machine-learning system)); and,
updating the machine-learning system based on results of the scoring function (col. 8 lines 37-44 recite that respective performance indicators corresponding to a set of predictive models can be compared to determine which predictive models should be selected (i.e. updating the system to choose the next data set to be trained based on the score of each model)).
See claim 10 for motivation to combine.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20200320371 A1 (Baker et al) teaches a system and method for generating and/or augmenting training data for a machine learning model.
US 20180285663 A1 (Viswanathan et al) teaches a method for augmenting a training data set for image analysis.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEAH M FEITL whose telephone number is (571)272-8350. The examiner can normally be reached on M-F 0800-1700.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/L.M.F./
Examiner, Art Unit 2121

 
/Li B. Zhen/
Supervisory Patent Examiner, Art Unit 2121