DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Acknowledgement is made of Applicant’s claim amendments on 08/04/2021. The claim amendments are entered. Presently, claims 1-2 and 5-27 remain pending. Claims 1 and 13 have been amended.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1 and 13 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-2 and 5-27 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1 and 13 recite the limitation “training, by the computer system, a classifier on the confidential data without access to the confidential data and with access to the non-confidential training data”. It is unclear how a classifier can be trained on “confidential data” without access to the “confidential data”. Claims 2, 5-12, and 14-27 are dependent claims that do not cure the deficiencies and are rejected for the same reasons.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8, 9, 10, 12, 13, 14, 15, 16, 18, 20, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1; hereinafter Barbu) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), and Barnhill et al. (US-6789069-B1).
Regarding Claim 1,
Barbu teaches a computer implemented method for training a classifier on confidential data, the method comprising 
collecting the …data as multiple data samples on a user device by way of a user interface (para [0036] Other embodiments of the invention include devices such as computer systems and computer-readable media having programs or applications to accomplish the exemplary methods described herein.), each of the multiple data samples comprising one or more feature values and a label that classifies that data sample (para [0022] For an example where the three views are Red, Green, and Blue intensities, the views of the point x.sub.i can be represented as [x.sup.R.sub.i, x.sup.G.sub.i, x.sup.B.sub.i].); 
creating non-confidential training data by determining each of multiple training samples from the multiple data samples, each of the multiple training samples preserving the privacy of the confidential data by 
randomly selecting a subset of the multiple data samples (para [0048]The results are the average accuracy of 20 tests, each time the datasets being randomly partitioned such that 60% of the data is in the training set and the remaining 40% is in the test set.), and 
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset (para [0026] For example, for each view j, train classifiers h.sup.j.sub.k on data sets sampled using weights distribution W.sub.k. In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N], such that X=[X.sup.RU X.sup.GU X.sup.B].), 
calculating, Page 2 of 16Appl. No. 15/521,441Attorney Docket No.: 096753-1047365 Amdt. dated August 4, 2021 Response to Office Action of March 25, 2021 by the computer system, a classifier weight associated with a feature index from the multiple training samples (para [0026-0030] In step 33, for each view, the weak classifiers C.sub.k.sup.j are separately trained on training examples sampled based on the weights' distribution. For example, for each view j, train classifiers h.sup.j.sub.k on data sets sampled using weights distribution W.sub.k… Next, in step 37, for the lowest error rate .epsilon.*.sub.k at iteration k calculate a combination weight value .alpha.*.sub.k as); and 
Barbu does not explicitly disclose
collecting the confidential data as multiple data samples on a user device by way of a user interface,
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset, by determining a weighted sum of the feature values of the data samples of the subset wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset, wherein the weighted sum comprises a sum of feature values of that feature multiplied by the respective labels of each of the data samples of the subset; 
sending the non-confidential training data including the multiple training samples including the combined feature values to a computer system for determining the classifier weight while maintaining privacy of the confidential data by preventing access to the confidential data from the computer system; 
training, by the computer system, a classifier on the confidential data without access to the confidential data and with access to the non-confidential training data by 
classifying, by the computer system using the trained classifier, a test value by determining, by the computer system, a classification of the test values based on the classifier weight;
However, Chen teaches
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset, by determining a weighted sum of the feature values of the data samples of the subset (pg. 388, section 2.3.3; There were three new fusion indices were obtained through the following summation calculations $C_{a,i} = C_{ax,i} + C_{ay,i} + C_{ash,i}$, $C_{v,i} = C_{vx,i} + C_{vy,i} + C_{vsh,i}$, $C_{f,i} = C_{fx,i} + C_{fy,i} + C_{fsh,i}$ (8) In Eq. (8), $C_{a,i}$, $C_{v,i}$, $C_{f,i}$ represent the fusion indices of the three detected signals. The feature indexes and fusion indexes are used as input (training samples) into the ANN as shown in figure 1 & figure 4.) wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset… (pg. 385; The definition is given as follows: $C_{as,i} = \left(\sum_{i=1}^{10} A_{s,i}\right) / 10A_{s,std}$ (3) In Eq. (3), i represents the ith data set. Therefore, three new indices were obtained: $C_{ax,i}$, $C_{ay,i}$, $C_{ash,i}$. For each feature index $C_{ax,i}$, the cutting experiment was repeated ten times. The average of the ten repeated feature indices was used to represent the real feature index $C_{ax,i}$. Equation (3) shows feature values corresponding to feature indexes in a summation function.); 
Lin teaches
collecting the confidential data as multiple data samples on a user device by way of a user interface (pg. 365, section 2.1; Except the works of geometric perturbation [6,7], existing privacy-preserving SVM works do not address the outsourcing of the SVM. The work of [18] considered the problem of releasing a built SVM classifier without revealing the support vectors, which is a subset of the training data.), 
sending the non-confidential training data including the multiple training samples including the combined feature values to a computer system (pg. 364; Figure 1(a) shows the application scenario of outsourcing the SVM training with privacy preservation. The data owner sends perturbed training data to the service provider, and then the service provider trains the SVM from the perturbed training data for the data owner.) for determining the classifier weight while maintaining privacy of the confidential data by preventing access to the confidential data from the computer system (pg. 365; The SVM finds the optimal separating hyperplane w · x + b = 0 to obtain the decision function f(x) = w · x + b by solving the following quadratic programming optimization problem: $\arg\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i$ subject to $y_i(w \cdot x_i + b) \geq 1 - \xi_i$, $\xi_i \geq 0$, $i = 1, \ldots, m$); 
training, by the computer system, a classifier on the confidential data without access to the confidential data and with access to the non-confidential training data (pg. 364; The proposed scheme enables the data owner to send the perturbed data to the service provider for outsourcing the SVM training without disclosing the actual content of the data, where the service provider trains SVMs from the perturbed data. Since the service provider may be untrustworthy, the perturbation protects the data privacy by avoiding unauthorized accesses to the sensitive content.) by 
classifying, by the computer system using the trained classifier, a test value by determining, by the computer system, a classification of the test values (pg. 365, section 2.2; The resulted classifier is sgn(f(x)) for determining which side of the optimal separating hyperplane the testing instance x falls into.) based on the classifier weight (pg. 365, section 2.2; The SVM finds the optimal separating hyperplane w · x + b = 0 to obtain the decision function f(x) = w · x + b by solving the following quadratic programming optimization problem. W denotes the weight.);
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a classifier of Barbu with the method of training a classifier of Lin.
Doing so would allow for preserving data privacy (pg. 363; In this paper, we propose a scheme for privacy-preserving outsourcing the training of the SVM without disclosing the actual content of the data to the service provider.).
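For illustration only, the decision rule quoted from Lin above (classifier sgn(f(x)) with f(x) = w · x + b) can be sketched in Python as follows. The function names and the numeric values of w and b are hypothetical and are not taken from Lin or from the application; treating f(x) = 0 as the positive class is likewise an assumption of this sketch.

```python
def decision_value(w, b, x):
    """Return f(x) = w . x + b for weight vector w, bias b, and test instance x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return sgn(f(x)) as +1 or -1 (assumption: f(x) == 0 maps to +1)."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical weight vector and bias, as a trained linear SVM might supply.
w = [2.0, -1.0]
b = -0.5

label_a = classify(w, b, [1.0, 0.0])  # f(x) = 2.0 - 0.5 = 1.5
label_b = classify(w, b, [0.0, 1.0])  # f(x) = -1.0 - 0.5 = -1.5
```

In this sketch the classification depends only on the weight vector w and bias b, which is the sense in which the classification is "based on the classifier weight".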
Barnhill (US 6789069 B1) teaches
	…wherein the weighted sum comprises a sum of feature values of that feature multiplied by the respective labels of each of the data samples of the subset (Col. 31 lines 30-35; where the summations run over all training patterns x.sub.i that are vectors of features (genes), x.sub.i.x.sub.j denotes the scalar product, y.sub.i encodes the class label as a binary value +1 or -1, .delta..sub.ij is the Kronecker symbol (.delta..sub.ij =1 if i=j and 0 otherwise), and .zeta. and C are positive constants (soft margin parameters).);
	It would have been obvious to one of ordinary skill in the art before the effective filing date to combine training a classifier of Barbu with the method of training a classifier of Barnhill.
	Doing so would allow for improved algorithm efficiency (Col. 19 lines 37-42; The optimal categorization method 300 takes advantage of dynamic programming techniques. As is known in the art, dynamic programming techniques may be used to significantly improve the efficiency of solving certain complex problems through carefully structuring an algorithm to reduce redundant calculations.)
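For illustration only, the weighted-sum limitation of claim 1 as recited above (a random subset of data samples whose feature values are each multiplied by the sample's label and then summed per feature) can be sketched in Python as follows. This is a reading of the claim language, not an implementation drawn from the application or from any cited reference; the function names, the 0.5 selection probability, and the sample values are hypothetical.

```python
import random


def label_weighted_sum(subset, n_features):
    """Sum, per feature, the feature value multiplied by the sample's label (+1 or -1)."""
    combined = [0.0] * n_features
    for features, label in subset:
        for j, value in enumerate(features):
            combined[j] += label * value
    return combined


def make_training_sample(data_samples, rng):
    """Randomly select a subset of the data samples and combine it into one training sample."""
    n_features = len(data_samples[0][0])
    # Assumption: each data sample is selected independently with probability 0.5.
    subset = [sample for sample in data_samples if rng.random() < 0.5]
    return label_weighted_sum(subset, n_features)


# Hypothetical confidential data: (feature values, label) pairs.
confidential = [([1.0, 2.0], +1), ([3.0, 5.0], -1), ([0.5, 0.5], +1)]

rng = random.Random(0)
training_data = [make_training_sample(confidential, rng) for _ in range(4)]
```

Only `training_data`, which holds the combined values, would be sent to the computer system under this reading; the individual entries of `confidential` stay on the collecting device.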
Regarding Claim 8,
Barbu, Chen, Lin, and Barnhill teach the method of claim 1.
Barbu further teaches wherein the data samples have signed real values as features values, and the label is one of '-1' and '+1' (para [0021] The set S is represented as S.sub.j=[(x.sup.j.sub.1, y.sub.1), (x.sup.j.sub.2, y.sub.2) . . . (x.sup.j.sub.N, y.sub.N)], where j=1 . . . M and y.sub.i .epsilon. [+1,-1] and each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the i.sup.th training example.).
However, Barbu et al. teaches
para [0045] The size and dimensions of the five data sets are shown in the following table.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Murray et al. with the method of training classifiers of Barbu et al.
	Doing so would allow for improved accuracy (para [0003] A goal of data fusion is to obtain a classifier h such that h learns from all the views available for each training point and has classification accuracy that is better than the case when only one view is available.).
Regarding Claim 9,
Barbu, Chen, Lin, and Barnhill teach the method of 
	Barbu et al. further teaches
wherein determining each of the multiple training samples comprises determining each of the multiple training samples such that each of the multiple training samples is based on at least a predetermined number of data samples (para [0045] The size and dimensions of the five data sets are shown in the following table.).
Regarding Claim 10,
Barbu, Chen, Lin, and Barnhill teach the method of claim 9. Barbu teaches wherein randomly selecting a subset of the multiple data samples comprises randomly selecting a subset of the multiple data samples that comprises at least a predetermined number (para [0045] The size and dimensions of the five data sets are shown in the following table.).
Regarding Claim 12,
Barbu, Chen, Lin, and Barnhill teach a non-transitory computer readable medium comprising computer-executable instructions stored thereon, that when executed by a processor, causes the processor to perform the method of claim 1 (para [0036] Other embodiments of the invention include devices such as computer systems and computer-readable media having programs or applications to accomplish the exemplary methods described herein.).
Regarding Claim 13,
Barbu teaches a system for training a classifier on confidential data, the system comprising a data collection device for determining multiple training samples from multiple data samples, and a computer system for receiving and processing the training samples, the data collection device comprising: 
an input port to receive the multiple data samples (para [0021] FIG. 2 describes the method of FIG. 1 in greater detail. Input 30 includes a training set S of data of N training points X=[x.sub.1, x.sub.2 . . . x.sub.N]. M disjoint features are available for each point x.sub.i=[x.sup.1.sub.i, x.sup.2.sub.i . . . x.sup.M.sub.i]. Each member x.sup.j.sub.i in the set x.sub.i is known as a view of point x.sub.i.), 
and a processor configured to collect the …data as the multiple data samples by way of a user interface, each of the multiple data samples comprising one or more feature values and a label that classifies that data sample (para [0022] For an example where the three views are Red, Green, and Blue intensities, the views of the point x.sub.i can be represented as [x.sup.R.sub.i, x.sup.G.sub.i, x.sup.B.sub.i].); 
create non-confidential training data by determining each of the multiple training samples from the multiple data samples, each of the multiple training samples preserving the privacy of the confidential data by 
randomly selecting a subset of the multiple data samples (para [0048]The results are the average accuracy of 20 tests, each time the datasets being randomly partitioned such that 60% of the data is in the training set and the remaining 40% is in the test set.), and 
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset (para [0026] For example, for each view j, train classifiers h.sup.j.sub.k on data sets sampled using weights distribution W.sub.k. In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N], such that X=[X.sup.RU X.sup.GU X.sup.B].) 
Barbu does not explicitly disclose
collect the confidential data as the multiple data samples by way of a user interface,
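combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset by determining a weighted sum of the feature values of the data samples of the subset wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset, wherein the weighted sum comprises a sum of feature values of that feature multiplied by the respective labels of each of the data samples of the subset;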
to send the non-confidential training data including the multiple training samples including the combined feature values to the computer system for determining the classifier weight while maintaining privacy of the confidential data by preventing access to the confidential data from the computer system; the computer system comprising a processor configured to: 
train, by the computer system, a classifier on the confidential data without access to the confidential data and with access to the non-confidential training data by calculating  a classifier weight associated with a feature index from the multiple training samples, and   
classify, by the computer system using the trained classifier, a test value by determining a classification of the test values based on the classifier weight.
However, Chen teaches 
combining the feature values of the data samples of the subset based on the label of each of the data samples of the subset (pg. 388, section 2.3.3; There were three new fusion indices were obtained through the following summation calculations $C_{a,i} = C_{ax,i} + C_{ay,i} + C_{ash,i}$, $C_{v,i} = C_{vx,i} + C_{vy,i} + C_{vsh,i}$, $C_{f,i} = C_{fx,i} + C_{fy,i} + C_{fsh,i}$ (8) In Eq. (8), $C_{a,i}$, $C_{v,i}$, $C_{f,i}$ represent the fusion indices of the three detected signals. The feature indexes and fusion indexes are used as input (training samples) into the ANN as shown in figure 1 & figure 4.) by determining a weighted sum of the feature values of the data samples of the subset wherein the feature value of a feature of the training sample is the weighted sum of the feature values of that feature of the data samples of the subset (pg. 385; The definition is given as follows: $C_{as,i} = \left(\sum_{i=1}^{10} A_{s,i}\right) / 10A_{s,std}$ (3) In Eq. (3), i represents the ith data set. Therefore, three new indices were obtained: $C_{ax,i}$, $C_{ay,i}$, $C_{ash,i}$. For each feature index $C_{ax,i}$, the cutting experiment was repeated ten times. The average of the ten repeated feature indices was used to represent the real feature index $C_{ax,i}$. Equation (3) shows feature values corresponding to feature indexes in a summation function.),
Lin teaches
collect the confidential data as the multiple data samples by way of a user interface (pg. 365, section 2.1; Except the works of geometric perturbation [6,7], existing privacy-preserving SVM works do not address the outsourcing of the SVM. The work of [18] considered the problem of releasing a built SVM classifier without revealing the support vectors, which is a subset of the training data.),
to send the non-confidential training data including the multiple training samples including the combined feature values to the computer system (pg. 364; Figure 1(a) shows the application scenario of outsourcing the SVM training with privacy preservation. The data owner sends perturbed training data to the service provider, and then the service provider trains the SVM from the perturbed training data for the data owner.) for determining the classifier weight while maintaining privacy of the confidential data by preventing access to the confidential data from the computer system (pg. 365; The SVM finds the optimal separating hyperplane w · x + b = 0 to obtain the decision function f(x) = w · x + b by solving the following quadratic programming optimization problem: $\arg\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i$ subject to $y_i(w \cdot x_i + b) \geq 1 - \xi_i$, $\xi_i \geq 0$, $i = 1, \ldots, m$); the computer system comprising a processor configured to: 
train, by the computer system, a classifier on the confidential data without access to the confidential data and with access to the non-confidential training data (pg. 364; The proposed scheme enables the data owner to send the perturbed data to the service provider for outsourcing the SVM training without disclosing the actual content of the data, where the service provider trains SVMs from the perturbed data. Since the service provider may be untrustworthy, the perturbation protects the data privacy by avoiding unauthorized accesses to the sensitive content.) by calculating a classifier weight associated with a feature index from the multiple training samples (pg. 365; The SVM finds the optimal separating hyperplane w · x + b = 0 to obtain the decision function f(x) = w · x + b by solving the following quadratic programming optimization problem: $\arg\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i$ subject to $y_i(w \cdot x_i + b) \geq 1 - \xi_i$, $\xi_i \geq 0$, $i = 1, \ldots, m$), and
classify, by the computer system using the trained classifier, a test value by determining a classification of the test values (pg. 365, section 2.2; The resulted classifier is sgn(f(x)) for determining which side of the optimal separating hyperplane the testing instance x falls into.) based on the classifier weight (pg. 365, section 2.2; The SVM finds the optimal separating hyperplane w · x + b = 0 to obtain the decision function f(x) = w · x + b by solving the following quadratic programming optimization problem. W denotes the weight.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of training a classifier of Barbu with the method of training a classifier of Lin.
Doing so would allow for preserving data privacy (pg. 363; In this paper, we propose a scheme for privacy-preserving outsourcing the training of the SVM without disclosing the actual content of the data to the service provider.).
	Barnhill (US 6789069 B1) teaches
	… wherein the weighted sum comprises a sum of feature values of that feature multiplied by the respective labels of each of the data samples of the subset (Col. 31 lines 30-35; where the summations run over all training patterns x.sub.i that are vectors of features (genes), x.sub.i.x.sub.j denotes the scalar product, y.sub.i encodes the class label as a binary value +1 or -1, .delta..sub.ij is the Kronecker symbol (.delta..sub.ij =1 if i=j and 0 otherwise), and .zeta. and C are positive constants (soft margin parameters).);
	It would have been obvious to one of ordinary skill in the art before the effective filing date to combine training a classifier of Barbu with the method of training a classifier of Barnhill.
Doing so would allow for improved algorithm efficiency (Col. 19 lines 37-42; The optimal categorization method 300 takes advantage of dynamic programming techniques. As is known in the art, dynamic programming techniques may be used to significantly improve the efficiency of solving certain complex problems through carefully structuring an algorithm to reduce redundant calculations.)
Regarding Claim 14,
Barbu, Chen, Lin, and Barnhill teach the computer implemented method of claim 1, comprising: 
receiving, by the computer system, multiple training values associated with the feature index (para [0024] As an example, for three views (red, green, and blue intensities), N training examples each have three disjoint views and a given training example x.sub.i can be represented as [x.sup.R.sub.i, x.sup.G.sub.i, x.sup.B.sub.i].), each training value being based on a combination of a subset of multiple data values… based on multiple data labels, each of the multiple data labels being associated with one of the multiple data values (para [0021] Each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the ith training example. Since M disjoint features are available for each point, there will be M training sets. The set S is represented as S.sub.j=[(x.sup.j.sub.1, y.sub.1), (x.sup.j.sub.2, y.sub.2) . . . (x.sup.j.sub.N, y.sub.N)], where j=1 . . . M and y.sub.i .epsilon. [+1,-1] and each (x.sup.j.sub.i,y.sub.i) pair represents the j.sup.th view and class label of the i.sup.th training example.);  Page 4 of 15Appl. No. 15/521,441Response to Office Action of November 29, 2019 
determining, by the computer system,  a correlation value based on the multiple training values, such that the correlation value is indicative of a correlation between para [0029] Next, in step 37, for the lowest error rate .epsilon.*.sub.k at iteration k calculate a combination weight value .alpha.*.sub.k as Examiner note: The error rate indicates the correlation between a label and its associated data value.); and 
determining, by the computer system, the classifier coefficient based on the correlation value (para [0029] $\alpha^*_k = \frac{1}{2}\ln\left(\frac{1-\epsilon^*_k}{\epsilon^*_k}\right)$. Examiner note: The error rate is used to calculate .alpha.*.sub.k, which is a coefficient for the classifier. Para [0033] $F(x) = \sum_{k=1}^{k_{max}} \alpha^*_k h^*_k(x^*)$.), wherein the feature value of a feature of the training sample is a combination of the feature values of that feature of the data samples (para [0026] In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N]).
Barbu does not explicitly disclose
data values that are kept securely on a data collection device by preventing access to the multiple data samples from the computer system.
However, Lee teaches 
…data values that are kept securely on a data collection device by preventing access to the multiple data samples from the computer system (para [0007] In other words, systems herein can collect training data for updating speech models by modified data in such a way that the modified data cannot be used to reconstruct raw utterances, or original messages contained therein, by human or machine.)…

Doing so would allow for updating models without including private information (Abs. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.).
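For illustration only, the combination weight of Barbu para [0029] relied upon above, which is computed from the lowest error rate as one half of ln((1 - error) / error), can be sketched in Python as follows. The function name and the sample error rates are hypothetical; the range check is an assumption added so that the logarithm is defined.

```python
import math


def combination_weight(error_rate):
    """Return 0.5 * ln((1 - error_rate) / error_rate) for an error rate in (0, 1)."""
    if not 0.0 < error_rate < 1.0:
        # Assumption: error rates of exactly 0 or 1 are rejected, since the
        # logarithm is undefined (or unbounded) at those values.
        raise ValueError("error_rate must be strictly between 0 and 1")
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


# Hypothetical error rates: a lower error rate yields a larger coefficient.
alpha_low_error = combination_weight(0.1)
alpha_high_error = combination_weight(0.3)
alpha_chance = combination_weight(0.5)
```

The sketch shows the relationship the Examiner note relies on: the coefficient grows as the error rate falls and is zero for a weak learner that performs at chance.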
Regarding Claim 15,
Barbu, Chen, Lin, and Barnhill teach the method of claim 14. Barbu further teaches determining for each of the multiple training values a training value weight associated with that training value (para [0024] The initial weight for each view will be w.sup.R.sub.i(i)=w.sup.G.sub.i(i)=w.sup.B.sub.1(i)=1/N.), wherein determining the correlation value is based on the training value weight associated with each of the multiple training values (para [0029] Next, in step 37, for the lowest error rate .epsilon.*.sub.k at iteration k calculate a combination weight value .alpha.*.sub.k).
Regarding Claim 16,
Barbu, Chen, Lin, and Barnhill teach the method of claim 15. Barbu further teaches wherein determining the correlation value comprises determining a sum of training values (para [0033] After k.sub.max iterations, a decision function F(x) is found as $F(x) = \sum_{k=1}^{k_{max}} \alpha^*_k h^*_k(x^*)$. Examiner note: Summation function of training values.) weighted by the training value weight associated with each of the multiple training values (para [0032] the sampling weight of the R, G and B views of example x.sub.i in iteration k are given by w.sup.R,G,B.sub.k(i)=w.sup.R.sub.k(i)=w.sup.G.sub.k(i)=w.sup.B.sub.k(i).).
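For illustration only, the decision function of Barbu para [0033], in which F(x) is the sum over iterations of each combination weight multiplied by the output of the corresponding selected weak classifier, can be sketched in Python as follows. The weak classifiers, their thresholds, and the weight values are hypothetical stand-ins and are not taken from Barbu.

```python
def ensemble_score(alphas, weak_learners, x):
    """Return F(x): the sum of alpha_k * h_k(x) over all iterations k."""
    return sum(alpha * h(x) for alpha, h in zip(alphas, weak_learners))


# Hypothetical weak classifiers, one per view, each returning +1 or -1.
weak_learners = [
    lambda x: 1 if x[0] > 0 else -1,
    lambda x: 1 if x[1] > 0 else -1,
    lambda x: 1 if x[2] > 0 else -1,
]
# Hypothetical combination weights for the three iterations.
alphas = [0.7, 0.3, 0.2]

score = ensemble_score(alphas, weak_learners, (1.0, -1.0, -1.0))
predicted_label = 1 if score >= 0 else -1
```

Here the first weak classifier votes +1 with weight 0.7 and outweighs the two -1 votes with weights 0.3 and 0.2, so the weighted sum is positive.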
Regarding Claim 18,
Barbu, Chen, Lin, and Barnhill teach the method of claim 15. Barbu further teaches wherein determining the training value weight associated with each of the training values comprises determining the training value weight associated with each of the multiple training values based on the correlation value (para [0030-0031] In step 38, update weights of the views as $w_{k+1}(i) = \frac{w_k(i)}{Z^*_k} \times \begin{cases} \exp(-\alpha^*_k) & \text{if } h^*_k(x^*_i) = y_i \\ \exp(\alpha^*_k) & \text{if } h^*_k(x^*_i) \neq y_i \end{cases}$ where h*.sub.k is the classifier with lowest error rate .epsilon.*.sub.k in the k.sup.th iteration.).
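For illustration only, the weight update of Barbu para [0030-0031] cited above, in which each weight is multiplied by exp(-alpha) when the selected classifier labels the example correctly and by exp(alpha) otherwise, can be sketched in Python as follows. The function name and sample values are hypothetical. Treating Z as the constant that makes the updated weights sum to one is an assumption of this sketch, since the quoted passage does not define Z.

```python
import math


def update_weights(weights, predictions, labels, alpha):
    """Scale weights down for correct predictions and up for incorrect ones, then normalize."""
    unnormalized = [
        w * math.exp(-alpha if prediction == label else alpha)
        for w, prediction, label in zip(weights, predictions, labels)
    ]
    # Assumption: Z is the sum of the unnormalized weights.
    z = sum(unnormalized)
    return [u / z for u in unnormalized]


# Hypothetical iteration: four examples with uniform weights, one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
predictions = [1, 1, -1, -1]
labels = [1, -1, -1, -1]
alpha = 0.5 * math.log(3.0)  # combination weight for an error rate of 0.25

new_weights = update_weights(weights, predictions, labels, alpha)
```

Under these assumptions the misclassified example ends up with half of the total weight, so the next sampling round concentrates on it.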
Regarding Claim 20,
Barbu, Chen, Lin, and Barnhill teach the method of para [0033] After k.sub.max iterations, a decision function F(x) is found as $F(x) = \sum_{k=1}^{k_{max}} \alpha^*_k h^*_k(x^*)$.), each classifier coefficient being associated with one of multiple feature indices (para [0026] For example, for each view j, train classifiers h.sup.j.sub.k on data sets sampled using weights distribution W.sub.k. In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N], such that X=[X.sup.RU X.sup.GU X.sup.B].).
Regarding Claim 27,
Barbu, Chen, Lin, and Barnhill teach the method of claim 1
receiving test values (para [0019] The final ensemble contains learners that are trained to focus on different views of the test data.); and 
determining a classification of the test values based on the classifier coefficients (  para [0048] Results for data sets 1-5 are illustrated in the tables on FIG. 3 and 4. The results are the average accuracy of 20 tests, each time the datasets being randomly partitioned such that 60% of the data is in the training set and the remaining 40% is in the test set. The average accuracy of individual classifier from each view before fusion is shown in the columns A.sub.R, A.sub.G, and A.sub.B, for red, green, and blue, respectively.).  

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1; hereinafter Barbu) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Barnhill et al. (US-6789069-B1), and Servedio et al. (US-8972307-B1).
Regarding Claim 2,
Barbu, Chen, Lin, and Barnhill teach the method of claim 1.
Barbu, Chen, Lin, and Barnhill do not explicitly disclose

	However, Servedio et al. teaches
wherein randomly selecting the subset of the multiple data samples comprises multiplying each of the multiple data samples by a random selection value that is unequal to zero to select that data sample or equal to zero to deselect that data sample (Col. 5 lines 44-53; The random unit vectors are used to compute random origin-centered halfspaces h.sub.1 . . . h.sub.k, where h.sub.1=sign (v.sub.1x) and h.sub.k=sign (v.sub.kx)….For example, row 52c contains unit vector v.sub.3, whose elements (0.9, 0.9, -0.8, 0.2, -0.3, 0.4) are in columns 54. When v.sub.3 is applied to an example in the set of examples (such as examples x.sub.1, . . . , x.sub.N in FIG. 3), each element of the unit vector v.sub.3 will be multiplied by the corresponding element of the examples, such as in the following illustration which depicts the vector multiplication of v.sub.3 and x.sub.1: (0.9,0.9,-0.8,0.2,-0.3,0.4).times.(0,0,1,0,1,1)).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Barbu with the method of training classifiers of Servedio et al.
Doing so would allow for reducing error (Col. 3 lines 7-10; Malicious noise can include examples that are incorrectly labeled and thus can tend to mislead the machine learner, resulting in classifiers that generate too many erroneous).
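For illustration only, the selection-by-multiplication limitation recited in claim 2 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Servedio or from the application: the function name `select_by_mask`, the choice of 0/1 selection values, the `keep_probability` parameter, and the sample data are all hypothetical.

```python
import random

def select_by_mask(samples, keep_probability=0.5, seed=None):
    """Hypothetical helper: multiply each sample by a random selection value.

    A selection value of 1 (non-zero) selects the sample; 0 deselects it.
    The 0/1 choice and keep_probability are assumptions for illustration.
    """
    rng = random.Random(seed)
    # One random selection value per data sample.
    mask = [1 if rng.random() < keep_probability else 0 for _ in samples]
    # Multiply every feature of a sample by that sample's selection value.
    masked = [[m * x for x in sample] for sample, m in zip(samples, mask)]
    # The selected subset consists of the samples with a non-zero value.
    subset = [sample for sample, m in zip(masked, mask) if m != 0]
    return mask, subset

# Made-up data samples, two feature values each.
samples = [[0.9, -0.8], [0.2, 0.4], [-0.3, 0.5], [1.0, 1.0]]
mask, subset = select_by_mask(samples, seed=0)
```

Under these assumptions, a deselected sample is zeroed out by the multiplication and therefore contributes nothing to any later sum over the samples.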

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1; hereinafter Barbu) in view of Lee et al. (US-20140129226-A1; hereinafter Lee), Barnhill et al. (US-6789069-B1), and Bhardwaj et al. (US-20150063688-A1).
Regarding Claim 5,
Barbu, Chen, Lin, Barnhill, and Bhardwaj et al. teach the method of claim 3.
	Barbu et al. further teaches
wherein determining the sum comprises determining a weighted sum (para [0033] After k.sub.max iterations, a decision function F(x) is found as $F(x) = \sum_{k=1}^{k_{\max}} \alpha_k^* h_k^*(x^*)$. Examiner note: weights summed in the summation function) that is weighted based on the number of data samples in the subset of the multiple data samples (para [0025] In a number of iterations from k=1 to k=k.sub.max, the training set is sampled 32 according to the weights of the training points.).
Regarding Claim 6,
Barbu, Chen, Lin, Barnhill, and Bhardwaj et al. teach the method of claim para [0035] In each iteration, sampling and weight update is performed using a shared sampling distribution. As a result, the weights for all views of a given training example are updated according to the opinion of the classifier from the lowest error view.).

Claims 7 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Murray et al. (US-20130315477-A1) in view of Barbu et al. (US-20080027887-A1), Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), and Barnhill et al. (US-6789069-B1).
Regarding Claim 7,
Barbu, Chen, Lin, and Barnhill teach the method of 
Murray et al. teaches wherein randomly selecting a subset of multiple data samples comprises randomly selecting a subset of multiple data samples based on a non-uniform distribution (para [0073] For example, in the case of visual features, the classifier component 30 may include a patch extractor, which extracts and analyzes content-related features of patches of the image 22, 24, such as shape, texture, color, or the like. The patches can be obtained by image segmentation, by applying specific interest point detectors, by considering a regular grid, or simply by random sampling of image patches.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Murray et al. with the method of training classifiers of Barbu et al.
	Doing so would allow for improved accuracy (para [0003] A goal of data fusion is to obtain a classifier h such that h learns from all the views available for each training point and has classification accuracy that is better than the case when only one view is available.).
Regarding Claim 26,
para [0133] For all 4 feature types, the best results were achieved for a classifier learned by stochastic gradient descent with a regularization parameter (for optimizing the multivariate loss function) of 10.sup.-3. For color and SIFT features, best results were achieved using a 3.times.3 spatial pyramid and the entire image.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Murray et al. with the method of training classifiers of Barbu et al.
	Doing so would allow for improved accuracy (para [0003] A goal of data fusion is to obtain a classifier h such that h learns from all the views available for each training point and has classification accuracy that is better than the case when only one view is available.).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Barnhill et al. (US-6789069-B1), and Ma et al. (US-20070239444-A1).
Regarding Claim 11,
Barbu, Chen, Lin, and Barnhill teach a computer implemented method for determining multiple training samples, the method comprising: 
para [0084] The features extracted from each training image 22 can be aggregated (e.g., concatenated) into a single image representation of the image which is input to the classifier (or set of binary classifiers) along with its label.); and 
Barbu does not explicitly disclose
determining for each feature value of the training sample a random value and adding the random value to that feature value to determine a modified training sample
However, Ma teaches
determining for each feature value of the training sample a random value and adding the random value to that feature value to determine a modified training sample (para [0022] For example, the processor 130 can add random noise to the feature vector to artificially extend the numeric range of the features.).
It would have been obvious to a person having ordinary skill in the art to combine the feature vectors of Murray et al. with the method of adding perturbation to feature vectors of Ma et al.
Doing so would allow for creating a more robust model (para [0027] Certain coefficient sets are more robust to noise, dynamic range, precision, and scaling.).
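For illustration only, the limitation of adding a random value to each feature value of a training sample can be sketched as follows. This is a minimal sketch under stated assumptions and is not Ma's implementation: the function name `perturb_features`, the Gaussian noise, and the `scale` parameter are hypothetical.

```python
import random

def perturb_features(sample, scale=0.1, seed=None):
    """Hypothetical helper: add one random value to each feature value.

    Zero-mean Gaussian noise with standard deviation `scale` is an
    assumption; the cited passage only says that random noise is added.
    """
    rng = random.Random(seed)
    # Determine one random value per feature value of the training sample.
    noise = [rng.gauss(0.0, scale) for _ in sample]
    # Add each random value to its feature value to get the modified sample.
    modified = [x + n for x, n in zip(sample, noise)]
    return modified, noise

# Made-up training sample with three feature values.
original = [1.0, 2.0, 3.0]
modified, noise = perturb_features(original, scale=0.1, seed=1)
```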

Claims 17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Barnhill et al. (US-6789069-B1), and Jabo (“Machine Vision for Wood Defect Detection and Classification”).
Regarding Claim 17,
Barbu, Chen, Lin, and Barnhill teach the method of claim 16.
Barbu, Chen, Lin, and Barnhill do not explicitly disclose
wherein determining the correlation value comprises: determining a maximum training value; and dividing the sum by the maximum training value.
However, Jabo teaches
wherein determining the correlation value comprises:
determining a maximum training value (pg. 14; $V := \max(R, G, B)$); and
dividing the sum by the maximum training value (pg. 13; $r := \frac{V-R}{V-X}$, $g := \frac{V-G}{V-X}$, $b := \frac{V-B}{V-X}$).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the method of training classifiers of Barbu et al. with the method of training classifiers of Jabo.
Doing so would allow for object recognition (pg. 3 Object recognition in computer science requires a translation of these features into a numerical form. These feature values will then be processed in a decision making program which is called the classifier. The main purpose of a classifier is to associate each feature sample with a class label.).
Regarding Claim 19,
Barbu, Chen, Lin, and Barnhill teach the method of claim 18. 
Barbu, Chen, Lin, and Barnhill do not explicitly disclose
wherein determining each training value weight associated with one of the multiple training values comprises: determining a maximum training value; and determining the training value weight based on a fraction of the one of the multiple training values over the maximum training value.
However, Jabo teaches
wherein determining each training value weight associated with one of the multiple training values comprises: 
determining a maximum training value (pg. 14; $V := \max(R, G, B)$); and
determining the training value weight based on a fraction of the one of the multiple training values over the maximum training value (pg. 18; The weight vector is updated according to $w_i^{t+1} = \frac{D_t(i)\, e^{-\alpha_t h_t(x_i) y_i}}{Z_t}$ (10) Where, $D_t(i)$ is the weight distribution over the sample $i$ in the present round and $Z_t$ is a normalization factor so that $w_i^{t+1}$ will be a normalized weight distribution in next round. $Z_t = \sum_{i=1}^{N} D_t(i)\, e^{-\alpha_t h_t(x_i) y_i}$ (11)).

Claims 21, 22, 23, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Barnhill et al. (US-6789069-B1), and Beymer et al. (US-20150278707-A1).
Regarding Claim 21,
Barbu, Chen, Lin, and Barnhill teach the method of claim 20.
Barbu, Chen, Lin, and Barnhill do not explicitly disclose
wherein determining the training value weight comprises determining the training value weight based on a difference between a first value of a regularization function of a current repetition and a second value of the regularization function of a previous repetition.
However, Beymer et al. teaches 
wherein determining the training value weight comprises determining the training value weight based on a difference between a first value of a regularization function of a current repetition and a second value of the regularization function of a previous repetition (para [0020] Accordingly, the cost function can be regularized using $\sum_{i=1}^{N} \sum_{j=1}^{N} D(C(d_i), C(d_j)) (\alpha_i - \alpha_j)^2$ and $\sum_{p=1}^{T} \sum_{q=1}^{T} D(e(C_p), e(C_q)) (\beta_p - \beta_q)^2$.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the loss function of Barbu et al. with the regularization method of Beymer et al.
	Doing so would allow for preventing overfitting (para [0019] In exemplary embodiments, regularization is needed to solve this under-determined linear system and prevent overfitting.).
Regarding Claim 22,
Barbu, Chen, Lin, Barnhill and Beymer et al. teach the method of claim 21. Beymer et al. further teaches wherein the regularization function depends on the multiple classifier coefficients associated with the corresponding repetition (para [0019] In addition, the behavioral similarity of each weak classifier can be identified by investigating the individual weak learner's output from the whole training data set. For example, if the two classifiers C.sub.p and C.sub.q have similar outputs, similar weights .beta..sub.p and .beta..sub.q can be assigned to them (the p,q.sup.th columns in FIG. 2).).
Regarding Claim 23,
Barbu, Chen, Lin, Barnhill, and Beymer et al. teach the method of claim 21. Barbu et al. further teaches wherein determining the training value weight comprises determining the training value weight based on an exponential function having an exponent by adding the difference to the exponent (para [0030] In step 38, update weights of the views as $w_{k+1}(i) = \frac{w_k(i)}{Z_k^*} \times \begin{cases} \exp(-\alpha_k^*) & \text{if } h_k^*(x_i^*) = y_i \\ \exp(\alpha_k^*) & \text{if } h_k^*(x_i^*) \neq y_i \end{cases}$).
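For illustration only, the exponential weight update quoted from Barbu para [0030] can be sketched as follows. This is a minimal sketch under stated assumptions, not Barbu's implementation: the function name `update_weights` and the example data are hypothetical, and the normalization factor is assumed to be the sum of the unnormalized weights.

```python
import math

def update_weights(weights, predictions, labels, alpha):
    """Hypothetical helper for a boosting-style exponential weight update.

    Correctly classified points are scaled by exp(-alpha), misclassified
    points by exp(+alpha); dividing by z makes the weights sum to 1.
    """
    unnormalized = [
        w * math.exp(-alpha if p == y else alpha)
        for w, p, y in zip(weights, predictions, labels)
    ]
    z = sum(unnormalized)  # assumed normalization factor
    return [u / z for u in unnormalized]

# Made-up example: four training points, the third is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
predictions = [1, -1, 1, 1]
labels = [1, -1, -1, 1]
new_weights = update_weights(weights, predictions, labels, alpha=0.5)
```

Under these assumptions, the misclassified third point ends up with a larger weight than the correctly classified points, so it is more likely to be sampled in the next iteration.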
Regarding Claim 25,
Barbu, Chen, Lin, Barnhill, and Beymer et al. teach the method of para [0026] In the red-green-blue example, for each view j, weak learners h.sup.R, h.sup.G and h.sup.B will be trained on the training sets X.sup.R=[x.sup.R.sub.1, x.sup.R.sub.2 . . . , x.sup.R.sub.N], X.sup.G=[X.sup.G.sub.1, X.sup.G.sub.2, . . . , X.sup.G.sub.N], and X.sup.B=[x.sup.B.sub.1, x.sup.B.sub.2, . . . X.sup.B.sub.N], such that X=[X.sup.RU X.sup.GU X.sup.B].).
Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over Barbu et al. (US-20080027887-A1) in view of Chen et al. ("Data fusion neural network for tool condition monitoring in CNC milling machining."), Lin et al. ("Privacy-preserving outsourcing support vector machines with random transformation."), Barnhill et al. (US-6789069-B1), Beymer et al. (US-20150278707-A1), and Narsky et al. (US-9501749-B1).
Regarding Claim 24,
Barbu, Chen, Lin, Barnhill and Beymer et al. teach the method of 
Barbu, Chen, Lin, Barnhill and Beymer et al. do not explicitly disclose
wherein the regularization function comprises one or more of:
ridge function;
lasso function;
L∞-regularization; and
SLOPE regularisation.
However, Narsky teaches
wherein the regularization function comprises one or more of:
ridge function;
lasso function (Col. 14 lines 35-37; In some implementations, the framework may provide pruning for regression ensembles by a lasso technique.);
L∞-regularization; and
SLOPE regularisation.
It would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the regularization function of Beymer et al. with the lasso technique of Narsky.
Doing so would allow for reducing ensemble size (Col. 14 lines 46-49; The framework may execute the shrink method to reduce the ensemble size by removing learners with optimized weights below a certain threshold.).
Burstein et al. (US-20050143971-A1) – discloses a method for training a classifier using feature vectors with corresponding labels. 
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217. The examiner can normally be reached Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/HENRY NGUYEN/Examiner, Art Unit 2121

/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121