DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1 and 13 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
35 U.S.C. § 101 Rejection
Applicant Argues: 
In the Response to Arguments on page 3 of the Office Action, the Examiner states that the part of the limitation that recites "'while securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system' is understood to be describing the intended results of transmitting the training samples to the computer system. Accordingly, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception". 
In response, claim 1 has been amended to recite as separate steps: 
- sending the training samples to a computer system; 
- determining, by the computer system, a classifier weight; 
- securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system for determining the classifier weight. 
As a result, the claim now positively recites these features. 
Further, the sending of training samples is not a post-solution activity because the computer system, to which the training samples are sent, determines a classifier weight from the training samples and then determines a classification based on the classifier weight. As a result, the sending of the training samples is integrated into the claim as a whole. Therefore, the sending is not a post-solution activity but significantly more than an abstract idea. 

Examiner Response: The limitation “sending the training samples to a computer system” is a well-known additional element directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g). As to the limitation “determining, by the computer system, a classifier weight”, save for the recitation of generic computer equipment (“computer”), this step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-2 and 5-27 are rejected under 35 U.S.C. 101 because the claimed invention is directed to abstract ideas and mathematical concepts without significantly more. 
When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If a judicial exception is recited, the second prong (Step 2A, Prong 2) determines whether the claim integrates the judicial exception into a practical application. If it does not, Step 2B determines whether the claim recites additional elements that amount to significantly more than the judicial exception. See the 2019 PEG for more details of the analysis.
Step 1
	According to the first part of the analysis, in the instant case, claims 1-2, 5-12, and 14- are directed to a method and claim 13 is directed to a system comprising at least a processor. Thus, each of the claims falls within one of the four statutory categories (i.e. process, machine, manufacture, or composition of matter).
Claim 1 recites:
Step 2A, Prong 1
	“A computer implemented method for determining multiple training samples from multiple data samples, each of the multiple data samples comprising one or more feature values and a label that classifies that data sample, the method comprising: determining each of the multiple training samples by randomly selecting a subset of the multiple data samples”. Save for the recitation of a generic computer, this step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
“combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset, by determining a weighted sum of the feature values of the data samples of the subset wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset, wherein the weighted sum comprises a sum of feature values of that feature multiplied by the respective labels of each of the data samples of the subset” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
“determining…, a classifier weight associated with a feature index from the multiple training samples;” (Save for the recitation of generic computer equipment (“computer”), this step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.)
“determining…, a classification of the test values based on the classifier weight;” (Save for the recitation of generic computer equipment (“computer”), this step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.)
Step 2A, Prong 2
“A computer” (The generic computer is understood to be used to execute computer instructions. See MPEP 2106.05(f).)
“by the computer system” (The generic computer is understood to be used to execute computer instructions. See MPEP 2106.05(f).)
“sending the multiple training samples including the combined feature values to a computer system;” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“receiving, by the computer system, test values; determining, by the computer system, a classification of the test values based on the classifier weight” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system for determining the classifier weight.” (This step appears to be directed to storing information, which is understood to be insignificant extra-solution activity.)
Step 2B
“A computer” (The generic computer is understood to be used to execute computer instructions. See MPEP 2106.05(f).)
“by the computer system” (The generic computer is understood to be used to execute computer instructions. See MPEP 2106.05(f).)
“sending the multiple training samples including the combined feature values to a computer system;” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“receiving, by the computer system, test values; determining, by the computer system, a classification of the test values based on the classifier weight” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system for determining the classifier weight.” (This step appears to be directed to storing information, which is understood to be insignificant extra-solution activity.)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
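For illustration only, the selecting and combining steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch and not the applicant's implementation: the function names, the inclusion probability of 0.5, and the use of labels of -1 and +1 (as recited in claim 8) are assumptions made for the example.

```python
import random


def combine_subset(subset):
    """Return one training sample from a subset of (features, label) pairs.

    Each feature of the training sample is the sum, over the subset, of that
    feature's value multiplied by the label of its data sample.
    """
    n_features = len(subset[0][0])
    combined = [0.0] * n_features
    for features, label in subset:
        for j, value in enumerate(features):
            combined[j] += label * value
    return combined


def make_training_samples(data_samples, n_training, rng):
    """Build n_training samples, each from a randomly selected subset.

    Assumption: every data sample is kept with probability 0.5 and empty
    subsets are redrawn. The claim does not specify either choice.
    """
    if not data_samples:
        raise ValueError("at least one data sample is required")
    training = []
    while len(training) < n_training:
        subset = [s for s in data_samples if rng.random() < 0.5]
        if subset:
            training.append(combine_subset(subset))
    return training


# Hypothetical data: two features per sample, labels in {-1, +1}.
data = [([1.0, 2.0], 1), ([3.0, -1.0], -1), ([0.5, 0.5], 1)]
training = make_training_samples(data, n_training=4, rng=random.Random(0))
```

In this sketch only the combined values in `training` would be sent to the computer system; the individual data samples are not part of the output.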
Claim 2 recites:
Step 2A, Prong 1
	“wherein randomly selecting the subset of the multiple data samples comprises multiplying each of the multiple data samples by a random selection value that is unequal to zero to select that data sample or equal to zero to deselect that data sample.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
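For illustration only, the selection by multiplication recited in claim 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the selection value is drawn uniformly from 0 and 1, although the claim only requires a value that is zero (deselect) or unequal to zero (select).

```python
import random


def select_by_multiplication(data_values, rng):
    """Multiply each data value by a random selection value.

    A selection value of 0 deselects the value (the product is 0);
    a selection value of 1 selects it (the product is the value itself).
    """
    selection = [rng.choice([0, 1]) for _ in data_values]
    masked = [s * v for s, v in zip(selection, data_values)]
    return selection, masked


values = [2.5, -1.0, 4.0, 0.75]
selection, masked = select_by_multiplication(values, random.Random(0))
```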
Claim 5 recites:
Step 2A, Prong 1
	“The method of claim 3 …” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 6 recites:
Step 2A, Prong 1
“wherein the weighted sum is weighted based on a random number such that randomly selecting the subset of the multiple data samples is performed simultaneously with combining the feature values.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 7 recites:
Step 2A, Prong 1
“wherein randomly selecting a subset of multiple data samples comprises randomly selecting a subset of multiple data samples based on a non-uniform distribution.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
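For illustration only, a random selection based on a non-uniform distribution, as recited in claim 7, can be sketched in Python as follows. This is a minimal sketch: the function name and the per-sample inclusion probabilities are assumptions, since the claim does not specify which non-uniform distribution is used.

```python
import random


def select_nonuniform(data_samples, inclusion_probs, rng):
    """Keep each data sample with its own inclusion probability.

    Because the probabilities may differ from sample to sample, the
    resulting subset is drawn from a non-uniform distribution.
    """
    return [s for s, p in zip(data_samples, inclusion_probs)
            if rng.random() < p]


# Degenerate probabilities make the outcome deterministic for checking.
subset = select_nonuniform(["a", "b", "c"], [1.0, 0.0, 1.0], random.Random(0))
```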
Claim 8 recites:
Step 2A, Prong 1
“wherein the data samples have signed real values as features values, and the label is one of '-1' and '+1'.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 9 recites:
Step 2A, Prong 1
	“wherein determining each of the multiple training samples comprises determining each of the multiple training samples such that each of the multiple training samples is based on at least a predetermined number of data samples.” 
Step 2A, Prong 2
This claim does not appear to have any additional elements. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 10 recites:
Step 2A, Prong 1
	“wherein randomly selecting a subset of the multiple data samples comprises randomly selecting a subset of the multiple data samples that comprises at least a predetermined number of data samples.” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
Step 2A, Prong 2
This claim does not appear to have any additional elements. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 11 recites:
Step 2A, Prong 1
	“receiving a training sample according to claim 1”. This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
“determining for each feature value of the training sample a random value and adding the random value to that feature value to determine a modified training sample.” This step is understood to be a recitation of a mathematical concept. 
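For illustration only, the modification recited in claim 11 can be sketched in Python as follows. This is a minimal sketch: the function name, the zero-mean Gaussian noise, and the default standard deviation are assumptions, since the claim only recites adding a random value to each feature value.

```python
import random


def add_random_values(training_sample, rng, sigma=1.0):
    """Add an independent random value to every feature value.

    Returns the modified training sample together with the random values
    that were added, so the modification can be inspected.
    """
    noise = [rng.gauss(0.0, sigma) for _ in training_sample]
    modified = [v + e for v, e in zip(training_sample, noise)]
    return modified, noise


sample = [1.0, -2.0, 0.5]
modified, noise = add_random_values(sample, random.Random(1))
```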
Claim 12 recites:
Step 2A, Prong 1
This claim does not appear to recite any judicial exceptions.
Step 2A, Prong 2
	“A non-transitory computer readable medium comprising computer-executable instructions stored thereon, that when executed by a processor, causes the processor to perform the method of claim 1.” (The “computer readable medium” and “processor” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
Step 2B
“A non-transitory computer readable medium comprising computer-executable instructions stored thereon, that when executed by a processor, causes the processor to perform the method of claim 1.” (The “computer readable medium” and “processor” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 13 recites:
Step 2A, Prong 1
“… processing the training samples” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
	“each of the multiple data samples comprising one or more feature values and a label that classifies that data sample.” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
	“a processor configured to determine each of the multiple training samples by randomly selecting a subset of the multiple data samples”. Save for the recitation of generic computer equipment (“processor”), this step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
	“combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset by determining a weighted sum of the feature values of the data samples of the subset wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset, wherein the weighted sum comprises a sum of feature values of that feature multiplied by the respective labels of each of the data samples of the subset” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
	“determine a classifier weight associated with a feature index from the multiple training samples” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
	“determine a classification of the test values based on the classifier weight” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
Step 2A, Prong 2
	“A system comprising a data collection device for determining multiple training samples from multiple data samples, and a computer system for receiving and processing the training samples” (The “computer” and “computing device” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
	“an input port to receive the multiple data samples” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
	“computer system” and “data collection device” (The “computer” and “computing device” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
“to send the multiple training samples including the combined feature values to the computer system;” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“receive test values” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“the computer system comprising a processor configured to:” (The “computer” and “processor” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
	“wherein the processor of the data collecting device is further configured to securely keep the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system for determining the classifier weight.” (This step appears to be directed to storing information, which is understood to be insignificant extra-solution activity.)
Step 2B
“A system comprising a data collection device for determining multiple training samples from multiple data samples, and a computer system for receiving and processing the training samples” (The “computer” and “computing device” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
	“an input port to receive the multiple data samples” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
	“computer system” and “data collection device” (The “computer” and “computing device” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
“to send the multiple training samples including the combined feature values to the computer system;” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“receive test values, and determine a classification of the test values based on the classifier weight;” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
“the computer system comprising a processor configured to:” (The “computer” and “processor” are understood to be generic computer equipment used to execute computer instructions. See MPEP 2106.05(f).)
	“wherein the processor of the data collecting device is further configured to securely keep the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system for determining the classifier weight.” (This step appears to be directed to storing information, which is understood to be insignificant extra-solution activity.)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 14 recites:
Step 2A, Prong 1
	“The computer implemented method of claim 1 comprising:” Save for the recitation of a generic computer, this step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
	“determining… a correlation value based on the multiple training values, such that the correlation value is indicative of a correlation between each of the multiple data values and the data label associated with that data value”. This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
“determining… the classifier coefficient based on the correlation value, wherein the feature value of a feature of the training sample is a combination of the feature values of that feature of the data samples”. This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
“A computer” (The generic computer is understood to be used to execute computer instructions. See MPEP 2106.05(f).)
“by the computer system” (The generic computer is understood to be used to execute computer instructions. See MPEP 2106.05(f).)
“receiving… multiple training values associated with the feature index, each training value being based on a combination of a subset of multiple data values based on multiple data labels, each of the multiple data labels being associated with one of the multiple data values that are kept securely on a data collection device by preventing access to the multiple data samples from the computer system” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
Step 2B
“A computer” (The generic computer is understood to be used to execute computer instructions. See MPEP 2106.05(f).)
“receiving multiple training values associated with the feature index, each training value being based on a combination of a subset of multiple data values based on multiple data labels, each of the multiple data labels being associated with one of the multiple data values that are kept securely on a data collection device by preventing access to the multiple data samples from the computer system” (This step appears to be directed to transmitting or receiving information, which is understood to be insignificant extra-solution activity. See MPEP 2106.05(g).)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 15 recites:
Step 2A, Prong 1
	“further comprising determining for each of the multiple training values a training value weight associated with that training value, wherein determining the correlation value is based on the training value weight associated with each of the multiple training values.” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
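The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.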
Claim 16 recites:
Step 2A, Prong 1
“wherein determining the correlation value comprises determining a sum of training values weighted by the training value weight associated with each of the multiple training values.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 17 recites:
Step 2A, Prong 1
“wherein determining the correlation value comprises: determining a maximum training value”. This step is understood to be a recitation of a mathematical concept. 
“dividing the sum by the maximum training value.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
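For illustration only, the correlation value recited in claims 16 and 17 (a weighted sum of training values divided by a maximum training value) can be sketched in Python as follows. This is a minimal sketch: the function name is hypothetical, and taking the maximum in absolute value is an assumption made so that the division is well defined for signed values; the claims recite only "a maximum training value".

```python
def correlation_value(training_values, weights):
    """Weighted sum of the training values divided by the maximum value.

    Assumption: the maximum is taken over absolute values so that the
    divisor is positive whenever any training value is nonzero.
    """
    weighted_sum = sum(w * v for w, v in zip(weights, training_values))
    max_value = max(abs(v) for v in training_values)
    if max_value == 0:
        raise ValueError("all training values are zero")
    return weighted_sum / max_value


# 0.5*2.0 + 0.25*(-4.0) + 0.25*1.0 = 0.25, divided by 4.0
r = correlation_value([2.0, -4.0, 1.0], [0.5, 0.25, 0.25])
```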
Claim 18 recites:
Step 2A, Prong 1
“wherein determining the training value weight associated with each of the training values comprises determining the training value weight associated with each of the multiple training values based on the correlation value.” This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 19 recites:
Step 2A, Prong 1
“wherein determining each training value weight associated with one of the multiple training values comprises: determining a maximum training value”. This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
“determining the training value weight based on a fraction of the one of the multiple training values over the maximum training value.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 20 recites:
Step 2A, Prong 1
“comprising performing multiple repetitions of the method to determine multiple classifier coefficients, each classifier coefficient being associated with one of multiple feature indices.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 21 recites:
Step 2A, Prong 1
“wherein determining the training value weight comprises determining the training value weight based on a difference between a first value of a regularization function of a current repetition and a second value of the regularization function of a previous repetition.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 22 recites:
Step 2A, Prong 1
“wherein the regularization function depends on the multiple classifier coefficients associated with the corresponding repetition.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 23 recites:
Step 2A, Prong 1
“wherein determining the training value weight comprises determining the training value weight based on an exponential function having an exponent by adding the difference to the exponent.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 24 recites:
Step 2A, Prong 1
“wherein the regularization function comprises one or more of: ridge function; lasso function; L∞-regularization; and SLOPE regularisation.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 25 recites:
Step 2A, Prong 1
“further comprising selecting the feature index based on an ordering of multiple feature indices, wherein the ordering is based on the difference.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 26 recites:
Step 2A, Prong 1
“The method of …” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim 27 recites:
Step 2A, Prong 1
“receiving test values”. This step appears to be practically implementable in the human mind and is understood to be a recitation of a mental process.
“determining a classification of the test values based on the classifier coefficients.” This step is understood to be a recitation of a mathematical concept. 
Step 2A, Prong 2
This claim does not appear to have any additional elements not already addressed. 
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8, 9, 10, 12, 13, 14, 15, 16, 18, 20, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1), Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), and Burstein et al. (US-20050143971-A1).
Regarding Claim 1,
Barbu teaches a computer implemented method performed by a data collection device for determining multiple training samples from multiple data samples, each of the multiple data samples comprising one or more feature values and a label that classifies that data sample, the method comprising: 
determining each of the multiple training samples (para [0022] For an example where the three views are Red, Green, and Blue intensities, the views of the point x_i can be represented as [x_i^R, x_i^G, x_i^B].) by 
randomly selecting a subset of the multiple data samples (para [0048] The results are the average accuracy of 20 tests, each time the datasets being randomly partitioned such that 60% of the data is in the training set and the remaining 40% is in the test set.), and 
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset (para [0026] For example, for each view j, train classifiers h_k^j on data sets sampled using weights distribution W_k. In the red-green-blue example, for each view j, weak learners h^R, h^G and h^B will be trained on the training sets X^R=[x_1^R, x_2^R, ..., x_N^R], X^G=[x_1^G, x_2^G, ..., x_N^G], and X^B=[x_1^B, x_2^B, ..., x_N^B], such that X=[X^R ∪ X^G ∪ X^B].)…; 
determining…, a classifier weight associated with a feature index from the multiple training samples (para [0026-0030] In step 33, for each view, the weak classifiers C_k^j are separately trained on training examples sampled based on the weights' distribution. For example, for each view j, train classifiers h_k^j on data sets sampled using weights distribution W_k… Next, in step 37, for the lowest error rate ε*_k at iteration k calculate a combination weight value α*_k as); 
Barbu does not explicitly disclose
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset, by determining a weighted sum of the feature values of the data samples of the subset wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset, wherein the weighted sum comprises a sum of feature values of that feature multiplied by the respective labels of each of the data samples of the subset; 
sending the multiple training samples including the combined feature values to a computer system; 
determining, by the computer system, a classifier weight associated with a feature index from the multiple training samples; 
receiving, by the computer system, test values; 
determining, by the computer system, a classification of the test values based on the classifier weight; and 
securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system for determining the classifier weight.
However, Chen teaches
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset, by determining a weighted sum of the feature values of the data samples of the subset (pg. 388, section 2.3.3; There were three new fusion indices were obtained through the following summation calculations $C_{a,i} = C_{ax,i} + C_{ay,i} + C_{ash,i}$, $C_{v,i} = C_{vx,i} + C_{vy,i} + C_{vsh,i}$, $C_{f,i} = C_{fx,i} + C_{fy,i} + C_{fsh,i}$ (8) In Eq. (8), $C_{a,i}$, $C_{v,i}$, $C_{f,i}$ represent the fusion indices of the three detected signals. The feature indexes and fusion indexes are used as input (training samples) into the ANN as shown in figure 1 & figure 4.) wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset… (pg. 385; The definition is given as follows: $C_{as,i} = \left(\sum_{i=1}^{10} A_{s,i}\right) / (10 A_{s,std})$ (3) In Eq. (3), i represents the ith data set. Therefore, three new indices were obtained: $C_{ax,i}$, $C_{ay,i}$, $C_{ash,i}$. For each feature index $C_{ax,i}$, the cutting experiment was repeated ten times. The average of the ten repeated feature indices was used to represent the real feature index $C_{ax,i}$. Equation (3) shows feature values corresponding to feature indexes in a summation function.); 

	Doing so would allow for improved data fusion (pg. 381; The convergence speed and the test error were recorded and used to represent the training efficiency and test performance of the different data fusion methods. From an analysis of the results of the calculations based on the experimental data, it was found that the performance of the monitoring system could be significantly improved with suitable selection of the data fusion method.).
Burstein (US 20050143971 A1) teaches
wherein the… feature values of that feature multiplied by the respective labels of each of the data samples of the subset (Para [0029] The vector may be computed by multiplying each entry of the label vector corresponding to one or more second words proximate to the word by a weighted value.); 
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a classifier of Barbu with the method of training the classifier of Burstein.
Doing so would allow for automatic feature combination (para [0069] A support vector machine (SVM) is a classifying engine that may be applied to a variety of machine learning applications. SVMs may permit automatic feature combination by using similarity scores generated by a Random Indexing module as predictive features.)
Lin teaches
pg. 364; Figure 1(a) shows the application scenario of outsourcing the SVM training with privacy preservation. The data owner sends perturbed training data to the service provider, and then the service provider trains the SVM from the perturbed training data for the data owner.); 
determining, by the computer system, a classifier weight associated with a feature… from the multiple training samples (pg. 365; The SVM finds the optimal separating hyperplane w · x + b = 0 to obtain the decision function f(x) = w · x + b by solving the following quadratic programming optimization problem: $\arg\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i$ subject to $y_i(w \cdot x_i + b) \geq 1 - \xi_i$, $\xi_i \geq 0$, $i = 1, \ldots, m$); 
receiving, by the computer system, test values (pg. 264; The data owner can send the perturbed testing instances to the service provider for privately outsourcing the testing as shown in Figure 1(b).); 
determining, by the computer system, a classification of the test values based on the classifier weight (pg. 365; The resulted classifier is sgn(f(x)) for determining which side of the optimal separating hyperplane the testing instance x falls into.); and 
securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system for determining the classifier weight (pg. 364; The proposed scheme enables the data owner to send the perturbed data to the service provider for outsourcing the SVM training without disclosing the actual content of the data, where the service provider trains SVMs from the perturbed data. Since the service provider may be untrustworthy, the perturbation protects the data privacy by avoiding unauthorized accesses to the sensitive content.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a classifier of Barbu with the method of training a classifier of Lin.
Doing so would allow for preserving data privacy (pg. 363; In this paper, we propose a scheme for privacy-preserving outsourcing the training of the SVM without disclosing the actual content of the data to the service provider.).
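For illustration only, the combining limitation recited in claim 1 (a sum of feature values multiplied by the respective labels over a randomly selected subset) can be sketched as follows. This is a minimal sketch and is not taken from any cited reference: the function names `combine_subset` and `make_training_sample`, the 0.5 selection probability, and the list-of-pairs data layout are all assumptions made for the example.

```python
import random

def combine_subset(subset, n_features):
    """Return the label-weighted sum of feature values over a subset.

    Each element of `subset` is a (features, label) pair with label -1 or +1.
    """
    combined = [0.0] * n_features
    for features, label in subset:
        for j, value in enumerate(features):
            combined[j] += label * value
    return combined

def make_training_sample(data_samples, rng, keep_probability=0.5):
    """Randomly select a subset and combine it into one training sample.

    The selection rule (independent coin flips) is an assumption.
    """
    subset = [s for s in data_samples if rng.random() < keep_probability]
    n_features = len(data_samples[0][0])
    return combine_subset(subset, n_features), subset

# Fixed subset, so the result can be checked by hand:
# feature 0: (+1)(1.0) + (-1)(3.0) = -2.0; feature 1: (+1)(2.0) + (-1)(-1.0) = 3.0
print(combine_subset([([1.0, 2.0], 1), ([3.0, -1.0], -1)], 2))  # [-2.0, 3.0]
```

Under these assumptions, only the combined vector would be sent to the computer system, while the individual (features, label) pairs stay on the data collection device.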
Regarding Claim 8,
Barbu, Chen, Lin, and Burstein teach the method of claim 1.
Barbu further teaches wherein the data samples have signed real values as features values, and the label is one of '-1' and '+1' (para [0021] The set S is represented as S.sub.j=[(x.sup.j.sub.1, y.sub.1), (x.sup.j.sub.2, y.sub.2) . . . (x.sup.j.sub.N, y.sub.N)], where j=1 . . . M and y.sub.i .epsilon. [+1,-1] and each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the i.sup.th training example.).
However, Barbu et al. teaches
wherein determining each of the multiple training samples comprises determining each of the multiple training samples such that each of the multiple training samples is based on at least a predetermined number of data samples (para [0045] The size and dimensions of the five data sets are shown in the following table.).

	Doing so would allow for improved accuracy (para [0003] A goal of data fusion is to obtain a classifier h such that h learns from all the views available for each training point and has classification accuracy that is better than the case when only one view is available.).
Regarding Claim 9,
Barbu, Chen, Lin, and Burstein teach the method of 
	Barbu et al. further teaches
wherein determining each of the multiple training samples comprises determining each of the multiple training samples such that each of the multiple training samples is based on at least a predetermined number of data samples (para [0045] The size and dimensions of the five data sets are shown in the following table.).
Regarding Claim 10,
Barbu, Chen, Lin, and Burstein teach the method of claim 9. Barbu teaches wherein randomly selecting a subset of the multiple data samples comprises randomly selecting a subset of the multiple data samples that comprises at least a predetermined number of data samples (para [0045] The size and dimensions of the five data sets are shown in the following table.).
Regarding Claim 12,
Barbu, Chen, Lin, and Burstein teach a non-transitory computer readable medium comprising computer-executable instructions stored thereon, that when executed by a para [0036] Other embodiments of the invention include devices such as computer systems and computer-readable media having programs or applications to accomplish the exemplary methods described herein.).
Regarding Claim 13,
Barbu teaches a data collection device for determining multiple training samples from multiple data samples, the computer system comprising: 
an input port to receive the multiple data samples, each of the multiple data samples comprising one or more feature values and a label that classifies that data sample (para [0021] FIG. 2 describes the method of FIG. 1 in greater detail. Input 30 includes a training set S of data of N training points X=[x.sub.1, x.sub.2 . . . x.sub.N]. M disjoint features are available for each point x.sub.i=[x.sup.1.sub.i, x.sup.2.sub.i . . . x.sup.M.sub.i]. Each member x.sup.j.sub.i in the set x.sub.i is known as a view of point x.sub.i.); and 
a processor to determine each of the multiple training samples by randomly selecting a subset of the multiple data samples (para [0048] The results are the average accuracy of 20 tests, each time the datasets being randomly partitioned such that 60% of the data is in the training set and the remaining 40% is in the test set.), and 
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset (para [0021] Each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the ith training example. Since M disjoint features are available for each point, there will be M training sets. The set S is represented as S.sub.j=[(x.sup.j.sub.1, y.sub.1), (x.sup.j.sub.2, y.sub.2) . . . (x.sup.j.sub.N, y.sub.N)], where j=1 . . . M and y.sub.i .epsilon. [+1,-1] and each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the i.sup.th training example.) wherein the feature value of a feature of the training sample is a combination of the feature values of that feature of the data samples (para [0026] In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N]); and 
…training samples including the combined feature… for determining a classifier weight associated with a feature index from the multiple training samples (para [0026] For example, for each view j, train classifiers h.sup.j.sub.k on data sets sampled using weights distribution W.sub.k.)… for determining the classifier weight (para [0020] After a classifier h.sub.k* with lowest error rate is selected and combination weight .alpha..sub.k* is obtained).
Barbu does not explicitly disclose
to send the multiple training samples including the combined feature values to a computer system… while securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system…
However, Lee teaches
to send the multiple training samples including the combined feature values to a computer system (fig. 4; para [0056] In step 242, the privacy manager transmit derived statistical data to a server that aggregates derived statistical data from multiple client devices. And para [0023] The processed data or adaptation data can be a version of the acoustic model used at the server, but only containing information from a single user. Thus, shared general acoustic models can be continually updated without storing personal and confidential audio data on third-party systems for training.)… while securely keeping the multiple data samples on the data collection device by preventing access to the multiple data samples from the computer system (para [0007] In other words, systems herein can collect training data for updating speech models by modified data in such a way that the modified data cannot be used to reconstruct raw utterances, or original messages contained therein, by human or machine.)…
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a model of Barbu with the method of training a model of Lee.
Doing so would allow for updating models without including private information (Abs. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.).
Regarding Claim 14,
Barbu, Chen, Lin, and Burstein teach the computer implemented method of claim 1, comprising: 
para [0024] As an example, for three views (red, green, and blue intensities), N training examples each have three disjoint views and a given training example x.sub.i can be represented as [x.sup.R.sub.i, x.sup.G.sub.i, x.sup.B.sub.i].), each training value being based on a combination of a subset of multiple data values… based on multiple data labels, each of the multiple data labels being associated with one of the multiple data values (para [0021] Each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the ith training example. Since M disjoint features are available for each point, there will be M training sets. The set S is represented as S.sub.j=[(x.sup.j.sub.1, y.sub.1), (x.sup.j.sub.2, y.sub.2) . . . (x.sup.j.sub.N, y.sub.N)], where j=1 . . . M and y.sub.i .epsilon. [+1,-1] and each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the i.sup.th training example.);
determining, by the computer system,  a correlation value based on the multiple training values, such that the correlation value is indicative of a correlation between each of the multiple data values and the data label associated with that data value (para [0029] Next, in step 37, for the lowest error rate .epsilon.*.sub.k at iteration k calculate a combination weight value .alpha.*.sub.k as Examiner note: The error rate indicates the correlation between a label and its associated data value.); and 
determining, by the computer system, the classifier coefficient based on the correlation value (para [0029] $\alpha_k^* = \frac{1}{2}\ln\left(\frac{1-\epsilon_k^*}{\epsilon_k^*}\right)$. Examiner note: The error rate is used to calculate $\alpha_k^*$, which is a coefficient for the classifier. Para [0033] $F(x) = \sum_{k=1}^{k_{max}} \alpha_k^* h_k^*(x^*)$.), wherein the feature value of a feature of the training sample is a combination of the feature values of that feature of the data samples (para [0026] In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N]).
Barbu does not explicitly disclose
data values that are kept securely on a data collection device by preventing access to the multiple data samples from the computer system.
However, Lee teaches 
…data values that are kept securely on a data collection device by preventing access to the multiple data samples from the computer system (para [0007] In other words, systems herein can collect training data for updating speech models by modified data in such a way that the modified data cannot be used to reconstruct raw utterances, or original messages contained therein, by human or machine.)…
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a model of Barbu with the method of training a model of Lee.
Doing so would allow for updating models without including private information (Abs. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.).
Regarding Claim 15,
para [0024] The initial weight for each view will be w.sup.R.sub.i(i)=w.sup.G.sub.i(i)=w.sup.B.sub.1(i)=1/N.), wherein determining the correlation value is based on the training value weight associated with each of the multiple training values (para [0029] Next, in step 37, for the lowest error rate .epsilon.*.sub.k at iteration k calculate a combination weight value .alpha.*.sub.k).
Regarding Claim 16,
Barbu, Chen, Lin, and Burstein teach the method of claim 15. Barbu further teaches wherein determining the correlation value comprises determining a sum of training values (para [0033] After k.sub.max iterations, a decision function F(x) is found as $F(x) = \sum_{k=1}^{k_{max}} \alpha_k^* h_k^*(x^*)$. Examiner note: Summation function of training values.) weighted by the training value weight associated with each of the multiple training values (para [0032] the sampling weight of the R, G and B views of example x.sub.i in iteration k are given by w.sup.R,G,B.sub.k(i)=w.sup.R.sub.k(i)=w.sup.G.sub.k(i)=w.sup.B.sub.k(i).).
Regarding Claim 18,
Barbu, Chen, Lin, and Burstein teach the method of claim 15. Barbu further teaches wherein determining the training value weight associated with each of the training values comprises determining the training value weight associated with each of the multiple training values based on the correlation value (para [0030-0031] In step 38, update weights of the views as $w_{k+1}(i) = \frac{w_k(i)}{Z_k^*} \times \begin{cases} \exp(-\alpha_k^*) & \text{if } h_k^*(x_i^*) = y_i \\ \exp(\alpha_k^*) & \text{if } h_k^*(x_i^*) \neq y_i \end{cases}$ where h*.sub.k is the classifier with lowest error rate .epsilon.*.sub.k in the k.sup.th iteration.).
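For illustration only, the combination weight and the weight update quoted from Barbu paras [0029]-[0031] follow the familiar boosting pattern, which can be sketched as below. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the normalizer Z is assumed to be the sum of the unnormalized weights so that the updated weights sum to one.

```python
import math

def combination_weight(error_rate):
    """alpha = 0.5 * ln((1 - error) / error), for 0 < error < 1."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)

def update_weights(weights, predictions, labels, alpha):
    """Shrink weights of correctly classified examples, grow the others.

    Assumption: Z is the sum of the unnormalized weights.
    """
    unnormalized = [
        w * math.exp(-alpha if p == y else alpha)
        for w, p, y in zip(weights, predictions, labels)
    ]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

# One of four equally weighted examples is misclassified (error rate 0.25).
alpha = combination_weight(0.25)
new_weights = update_weights([0.25] * 4, [1, 1, -1, -1], [1, 1, -1, 1], alpha)
print(new_weights)  # the misclassified example ends up with about half the total weight
```

In this example the three correct examples each end with a weight of about 1/6 and the misclassified one with about 1/2, so the next round of sampling concentrates on the example that was missed.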
Regarding Claim 20,
Barbu, Chen, Lin, and Burstein teach the method of para [0033] After k.sub.max iterations, a decision function F(x) is found as $F(x) = \sum_{k=1}^{k_{max}} \alpha_k^* h_k^*(x^*)$.), each classifier coefficient being associated with one of multiple feature indices (para [0026] For example, for each view j, train classifiers h.sup.j.sub.k on data sets sampled using weights distribution W.sub.k. In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N], such that X=[X.sup.RU X.sup.GU X.sup.B].).
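For illustration only, the decision function quoted from Barbu para [0033], a sum of weak-classifier outputs scaled by their combination weights, can be sketched as follows. The function names, the example threshold classifiers, and the use of the sign of F(x) as the final class are assumptions for the example and are not stated in the quoted passage.

```python
def decision_function(alphas, classifiers, x):
    """F(x) = sum over k of alpha_k * h_k(x), with each h_k(x) in {-1, +1}."""
    return sum(a * h(x) for a, h in zip(alphas, classifiers))

def classify(alphas, classifiers, x):
    """Assumed final rule: the sign of F(x), with ties mapped to +1."""
    return 1 if decision_function(alphas, classifiers, x) >= 0 else -1

# Two hypothetical weak classifiers, each looking at one feature index.
h0 = lambda x: 1 if x[0] > 0 else -1
h1 = lambda x: 1 if x[1] > 0 else -1

print(classify([0.7, 0.2], [h0, h1], (1.0, -1.0)))  # 1, since 0.7 - 0.2 > 0
```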
Regarding Claim 27,
Barbu, Chen, Lin, and Burstein teach the method of claim 1
receiving test values (para [0019] The final ensemble contains learners that are trained to focus on different views of the test data.); and 
para [0048] Results for data sets 1-5 are illustrated in the tables on FIG. 3 and 4. The results are the average accuracy of 20 tests, each time the datasets being randomly partitioned such that 60% of the data is in the training set and the remaining 40% is in the test set. The average accuracy of individual classifier from each view before fusion is shown in the columns A.sub.R, A.sub.G, and A.sub.B, for red, green, and blue, respectively.).  

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1; hereinafter Barbu) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Burstein et al. (US-20050143971-A1), and Servedio et al. (US-8972307-B1).
Regarding Claim 2,
Barbu, Chen, Lin, and Burstein teach the method of claim 1.
	Barbu, Chen, Lin, and Burstein do not explicitly disclose
	wherein randomly selecting the subset of the multiple data samples comprises multiplying each of the multiple data samples by a random selection value that is unequal to zero to select that data sample or equal to zero to deselect that data sample.
	However, Servedio et al. teaches
wherein randomly selecting the subset of the multiple data samples comprises multiplying each of the multiple data samples by a random selection value that is unequal to zero to select that data sample or equal to zero to deselect that data sample (Col. 5 lines 44-53; The random unit vectors are used to compute random origin-centered halfspaces h.sub.1 . . . h.sub.k, where h.sub.1=sign (v.sub.1x) and h.sub.k=sign (v.sub.kx)….For example, row 52c contains unit vector v.sub.3, whose elements (0.9, 0.9, -0.8, 0.2, -0.3, 0.4) are in columns 54. When v.sub.3 is applied to an example in the set of examples (such as examples x.sub.1, . . . , x.sub.N in FIG. 3), each element of the unit vector v.sub.3 will be multiplied by the corresponding element of the examples, such as in the following illustration which depicts the vector multiplication of v.sub.3 and x.sub.1: (0.9,0.9,-0.8,0.2,-0.3,0.4).times.(0,0,1,0,1,1)).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Barbu with the method of training classifiers of Servedio et al. 
Doing so would allow for reducing error (Col. 3 lines 7-10; Malicious noise can include examples that are incorrectly labeled and thus can tend to mislead the machine learner, resulting in classifiers that generate too many erroneous).
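For illustration only, the limitation of claim 2 (selecting or deselecting each data sample by multiplying it by a random selection value that is nonzero or zero) can be sketched as follows. This is a minimal sketch of the claim language, not of the Servedio technique: the function name, the 0/1 selection values, and the 0.5 selection probability are assumptions.

```python
import random

def select_by_multiplication(data_samples, rng, keep_probability=0.5):
    """Multiply every sample by a random selection value of 1 or 0.

    A sample multiplied by 0 becomes an all-zero vector, so it contributes
    nothing to any later sum over the samples.
    """
    mask = []
    scaled = []
    for features in data_samples:
        s = 1 if rng.random() < keep_probability else 0
        mask.append(s)
        scaled.append([s * v for v in features])
    return mask, scaled

samples = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
mask, scaled = select_by_multiplication(samples, random.Random(0))
print(mask, scaled)
```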

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1; hereinafter Barbu) in view of Lee et al. (US-20140129226-A1; hereinafter Lee) and Bhardwaj et al. (US-20150063688-A1).
Regarding Claim 5,
Barbu, Chen, Lin, Burstein, and Bhardwaj et al. teach the method of claim 3.
	Barbu et al. further teaches
wherein determining the sum comprises determining a weighted sum (para [0033] After k.sub.max iterations, a decision function F(x) is found as $F(x) = \sum_{k=1}^{k_{max}} \alpha_k^* h_k^*(x^*)$. Examiner note: weights summed in the summation function) that is weighted based on the number of data samples in the subset of the multiple data samples (para [0025] In a number of iterations from k=1 to k=k.sub.max, the training set is sampled 32 according to the weights of the training points.).
Regarding Claim 6,
Barbu, Chen, Lin, Burstein, and Bhardwaj et al. teach the method of claim para [0035] In each iteration, sampling and weight update is performed using a shared sampling distribution. As a result, the weights for all views of a given training example are updated according to the opinion of the classifier from the lowest error view.).

Claims 7 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Murray et al. (US-20130315477-A1) in view of Barbu et al. (US-20080027887-A1), Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), and Burstein et al. (US-20050143971-A1).
Regarding Claim 7,
Barbu, Chen, Lin, and Burstein teach the method of 
Murray et al. teaches wherein randomly selecting a subset of multiple data samples comprises randomly selecting a subset of multiple data samples based on a non-uniform distribution (para [0073] For example, in the case of visual features, the classifier component 30 may include a patch extractor, which extracts and analyzes content-related features of patches of the image 22, 24, such as shape, texture, color, or the like. The patches can be obtained by image segmentation, by applying specific interest point detectors, by considering a regular grid, or simply by random sampling of image patches.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Murray et al. with the method of training classifiers of Barbu et al.
	Doing so would allow for improved accuracy (para [0003] A goal of data fusion is to obtain a classifier h such that h learns from all the views available for each training point and has classification accuracy that is better than the case when only one view is available.).
Regarding Claim 26,
Barbu, Chen, Lin, and Burstein teach the method of para [0133] For all 4 feature types, the best results were achieved for a classifier learned by stochastic gradient descent with a regularization parameter (for optimizing the multivariate loss function) of 10.sup.-3. For color and SIFT features, best results were achieved using a 3.times.3 spatial pyramid and the entire image.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Murray et al. with the method of training classifiers of Barbu et al.
	Doing so would allow for improved accuracy (para [0003] A goal of data fusion is to obtain a classifier h such that h learns from all the views available for each training point and has classification accuracy that is better than the case when only one view is available.).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Burstein et al. (US-20050143971-A1), and Ma et al. (US-20070239444-A1).
Regarding Claim 11,
Barbu, Chen, Lin, and Burstein teach a computer implemented method for determining multiple training samples, the method comprising: 
receiving a training sample according to claim 1 (para [0084] The features extracted from each training image 22 can be aggregated (e.g., concatenated) into a single image representation of the image which is input to the classifier (or set of binary classifiers) along with its label.); and 
Barbu does not explicitly disclose
determining for each feature value of the training sample a random value and adding the random value to that feature value to determine a modified training sample
However, Ma teaches
determining for each feature value of the training sample a random value and adding the random value to that feature value to determine a modified training sample (para [0022] For example, the processor 130 can add random noise to the feature vector to artificially extend the numeric range of the features.).
It would have been obvious to a person having ordinary skill in the art to combine the feature vectors of Murray et al. with the method of adding perturbation to feature vectors of Ma et al.
Doing so would allow for creating a more robust model (para [0027] Certain coefficient sets are more robust to noise, dynamic range, precision, and scaling.).
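For illustration only, the limitation of claim 11 (determining a random value for each feature value and adding it to that feature value) can be sketched as follows. This is a minimal sketch: the function name, the zero-mean Gaussian noise, and the `scale` parameter are assumptions, since neither the claim nor the quoted Ma passage specifies a noise distribution.

```python
import random

def perturb_training_sample(features, rng, scale=0.1):
    """Add an independent random value to every feature value.

    Assumption: the random values are drawn from a zero-mean Gaussian
    with standard deviation `scale`.
    """
    return [v + rng.gauss(0.0, scale) for v in features]

modified = perturb_training_sample([1.0, -2.0, 0.5], random.Random(0))
print(modified)
```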

Claims 17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Burstein et al. (US-20050143971-A1), and Jabo (“Machine Vision for Wood Defect Detection and Classification”).
Regarding Claim 17,
Barbu, Chen, Lin, and Burstein teach the method of claim 16.
	Barbu, Chen, Lin, and Burstein do not explicitly disclose
wherein determining the correlation value comprises: determining a maximum training value; and dividing the sum by the maximum training value.
However, Jabo teaches
 wherein determining the correlation value comprises: 
determining a maximum training value (pg. 14; $V := \max(R, G, B)$); and 
dividing the sum by the maximum training value (pg. 13; $r := \frac{V-R}{V-X}$, $g := \frac{V-G}{V-X}$, $b := \frac{V-B}{V-X}$).

Doing so would allow for object recognition (pg. 3 Object recognition in computer science requires a translation of these features into a numerical form. These feature values will then be processed in a decision making program which is called the classifier. The main purpose of a classifier is to associate each feature sample with a class label.).
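For illustration only, the limitation of claim 17 (determining a maximum training value and dividing the weighted sum of claim 16 by it) can be sketched as follows. This is a minimal sketch of the claim language, not of the Jabo color conversion: the function name is hypothetical, and taking the maximum over absolute values is an assumption made so that the divisor is nonnegative.

```python
def correlation_value(training_values, weights):
    """Weighted sum of training values divided by the maximum training value.

    Assumption: "maximum training value" is taken as the largest absolute value.
    """
    weighted_sum = sum(w * v for w, v in zip(weights, training_values))
    max_value = max(abs(v) for v in training_values)
    if max_value == 0:
        raise ValueError("all training values are zero")
    return weighted_sum / max_value

# (0.5)(2.0) + (0.25)(-4.0) + (0.25)(1.0) = 0.25, divided by 4.0
print(correlation_value([2.0, -4.0, 1.0], [0.5, 0.25, 0.25]))  # 0.0625
```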
Regarding Claim 19,
Barbu, Chen, Lin, and Burstein teach the method of claim 18. 
Barbu, Chen, Lin, and Burstein do not explicitly disclose
	wherein determining each training value weight associated with one of the multiple training values comprises: determining a maximum training value; and determining the training value weight based on a fraction of the one of the multiple training values over the maximum training value.
However, Jabo teaches
wherein determining each training value weight associated with one of the multiple training values comprises: 
determining a maximum training value (pg. 14; $V := \max(R, G, B)$); and 
determining the training value weight based on a fraction of the one of the multiple training values over the maximum training value (pg. 18; The weight vector is updated according to $w_i^{t+1} = \frac{D_t(i)\, e^{-\alpha_t h_t(x_i) y_i}}{Z_t}$ (10) Where, $D_t(i)$ is the weight distribution over the sample $i$ in the present round and $Z_t$ is a normalization factor so that $w_i^{t+1}$ will be a normalized weight distribution in next round. $Z_t = \sum_{i=1}^{N} D_t(i)\, e^{-\alpha_t h_t(x_i) y_i}$ (11)).

Claims 21, 22, 23, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Burstein et al. (US-20050143971-A1), and Beymer et al. (US-20150278707-A1).
Regarding Claim 21,
Barbu, Chen, Lin, and Burstein teach the method of claim 20.
Barbu, Chen, Lin, and Burstein do not explicitly disclose
wherein determining the training value weight comprises determining the training value weight based on a difference between a first value of a regularization function of a current repetition and a second value of the regularization function of a previous repetition
However, Beymer et al. teaches 
wherein determining the training value weight comprises determining the training value weight based on a difference between a first value of a regularization function of a current repetition and a second value of the regularization function of a previous repetition (para [0020] Accordingly, the cost function can be regularized using $\sum_{i=1}^{N} \sum_{j=1}^{N} D(C(d_i), C(d_j)) (\alpha_i - \alpha_j)^2$ and $\sum_{p=1}^{T} \sum_{q=1}^{T} D(e(C_p), e(C_q)) (\beta_p - \beta_q)^2$.).

	Doing so would allow for preventing overfitting (para [0019] In exemplary embodiments, regularization is needed to solve this under-determined linear system and prevent overfitting.).
Regarding Claim 22,
Barbu, Chen, Lin, Burstein and Beymer et al. teach the method of claim 21. Beymer et al. further teaches wherein the regularization function depends on the multiple classifier coefficients associated with the corresponding repetition (para [0019] In addition, the behavioral similarity of each weak classifier can be identified by investigating the individual weak learner's output from the whole training data set. For example, if the two classifiers C.sub.p and C.sub.q have similar outputs, similar weights .beta..sub.p and .beta..sub.q can be assigned to them (the p,q.sup.th columns in FIG. 2).).
Regarding Claim 23,
Barbu, Chen, Lin, Burstein, and Beymer et al. teach the method of claim 21. Barbu et al. further teaches wherein determining the training value weight comprises determining the training value weight based on an exponential function having an exponent by adding the difference to the exponent (para [0030] In step 38, update weights of the views as 
$w_{k+1}(i) = \frac{w_k(i)}{Z_k^*} \times \begin{cases} \exp(-\alpha_k^*) & \text{if } h_k^*(x_i^*) = y_i \\ \exp(\alpha_k^*) & \text{if } h_k^*(x_i^*) \neq y_i \end{cases}$).
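For illustration only, an exponential weight update of the kind quoted from Barbu et al. para [0030] can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Barbu et al. or of the application: the function name `update_weights` is hypothetical, the normalizer is assumed to be the sum of the unnormalized weights, and the value of `alpha` is assumed to follow the textbook AdaBoost formula 0.5 * ln((1 - error) / error).

```python
import math


def update_weights(weights, predictions, labels, alpha):
    """Scale each weight by exp(-alpha) if the prediction matches the label,
    by exp(alpha) otherwise, then normalize so the weights sum to one."""
    unnormalized = [
        w * math.exp(-alpha if p == y else alpha)
        for w, p, y in zip(weights, predictions, labels)
    ]
    z = sum(unnormalized)  # assumed normalizer
    return [u / z for u in unnormalized]


# Hypothetical data: four samples with uniform weights, one misclassified (index 2).
weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, -1, 1, -1]
predictions = [1, -1, -1, -1]

error = 0.25  # total weight of the misclassified samples
alpha = 0.5 * math.log((1 - error) / error)  # assumed textbook AdaBoost coefficient

new_weights = update_weights(weights, predictions, labels, alpha)
```

With these hypothetical inputs the misclassified sample ends up carrying half of the total weight, which shows how the exponent sign shifts emphasis toward samples the weak learner got wrong.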
Regarding Claim 25,
Barbu, Chen, Lin, Burstein, and Beymer et al. teach the method of para [0026] In the red-green-blue example, for each view j, weak learners $h^R$, $h^G$ and $h^B$ will be trained on the training sets $X^R = [x^R_1, x^R_2, \ldots, x^R_N]$, $X^G = [x^G_1, x^G_2, \ldots, x^G_N]$, and $X^B = [x^B_1, x^B_2, \ldots, x^B_N]$, such that $X = [X^R \cup X^G \cup X^B]$.).
Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Burstein et al. (US-20050143971-A1), Beymer et al. (US-20150278707-A1), and Narsky et al. (US-9501749-B1).
Regarding Claim 24,
Barbu, Chen, Lin, Burstein and Beymer et al. teach the method of 
Barbu, Chen, Lin, Burstein and Beymer et al. do not explicitly disclose
 wherein the regularization function comprises one or more of:  
	ridge function;
lasso function; 
L., -regularization; and 
SLOPE regularisation.
However, Narsky et al. teaches

wherein the regularization function comprises one or more of:  
	ridge function;
lasso function (Col. 14 lines 35-37; In some implementations, the framework may provide pruning for regression ensembles by a lasso technique.); 
L., -regularization; and 
SLOPE regularisation.
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the regularization function of Beymer et al. with the lasso technique of Narsky et al.
Doing so would allow for reducing ensemble size (Col. 14 lines 46-49; The framework may execute the shrink method to reduce the ensemble size by removing learners with optimized weights below a certain threshold.).
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217.  The examiner can normally be reached on Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/HENRY NGUYEN/Examiner, Art Unit 2121


/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121