DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner notes the entry of the following papers:
Amended claims filed 2/13/2022.
Amended claims filed 3/8/2022.
Applicant arguments/remarks made in amendment filed 2/13/2022.
Applicant arguments/remarks made in amendment filed 3/8/2022.

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submissions filed on 2/13/2022 and 3/8/2022 have been entered.
 
Claims 1-2, 4, 6-8, 12, 14-15, 17-18, and 20 are amended.
Claims 1-20 are pending.
Claim 4 has been amended to remove reference to the terms “genus” and “species,” which were not expressly used in the specification.  As a result, the rejection of claim 4 under 35 U.S.C. § 112(a) is withdrawn.

Response to Arguments
Applicant’s arguments filed 2/13/2022 and 3/8/2022 have been fully considered but are not persuasive.
Applicant’s arguments are directed to limitations in amended independent claim 1, and by extension independent claims 7 and 14; and amended claims 4-6, 12, and 16-20. Each is addressed.
Applicant argues that the prior art of record does not “mention or suggest a same text query that has different labels” of amended claim 1.  (Remarks, page 9, paragraph 1.)  The argument is moot in view of new grounds of rejection.  See detailed rejection below.
Applicant argues that “none of the prior art teaches or suggests comparing two training data sets in order to identify a conflict in their labels.” (Remarks, page 9, paragraph 2.)  However, claim 1 does not mention data sets.  Claim 1 recites “a conflict between a first training data and a second training data…wherein the first training data and the second training data are a same text query.”  Based on context, Examiner interprets each of the first training data and the second training data as a data item, not a data set.  Therefore, the rejection is proper and maintained.
Applicant argues that the prior art “does not teach or suggest two instances of a same text query having different labels, as presently claimed.” (Remarks, page 9, paragraph 3.) The argument is moot in view of new grounds of rejection.  See detailed rejection below.
Applicant argues that the prior art “does not teach or suggest “wherein the conflict is a result of vagueness in a single label, and wherein the single label describes different types of queries, and wherein each of the different types of queries need to have unique labels in order to train the machine learning system”” of amended claim 4. (Remarks, page 10, paragraph 2.) The argument is moot in view of new grounds of rejection.  See detailed rejection below.
Applicant argues that the prior art “does not contain, teach, or suggest that training data is generated from a document.” Guan teaches this. (Guan, page 349, column 1, paragraph 3, line 1 “Existing MF, DF, and our proposed MFAUD, CFAUD are tested on the benchmark data sets from the Machine Learning Database Repository [13].  Information of these data sets is tabulated in Table 1. (Table 1 lists text-based datasets.) These data sets are collected from different real-world applications in various domains.”)  Examiner notes the Machine Learning Database Repository [13] is the UCI KDD archive (http://kdd.ics.uci.edu).  For text data, the UCI KDD archive uses the Reuters-21578 Text Categorization Collection, among others.  (Reuters-21578 Text Categorization Collection, description, page 1, paragraph 6 “The documents in the Reuters-21578 collection appeared on the Reuters newswire in 1987.  The documents were assembled and indexed with categories by personnel from Reuters Ltd.”)  (Supporting documents for the UCI KDD archive Reuters collection are included in this Office action, but can also be found at the URL listed above.)  In other words, the data sets are training data generated from documents.  Examiner further notes that the term “document” is ambiguous as it relates to electronic data.  Since the text data must necessarily be input into a machine learning model, the text data must be extracted from its source.  It is effectively irrelevant to the machine learning model what the source of the text was.  In the absence of a described distinguishing characteristic that would materially affect the training of the machine learning model, Examiner applies broadest reasonable interpretation and is interpreting ‘document’ to include all sources of electronic text (documents, lists, compilations, etc.).  Therefore, the rejection is proper and maintained.
Applicant argues that the prior art “does not teach or suggest the feature support by paragraph [0060] of the present specification of “wherein the text query is about a service provided by an enterprise, wherein a label describing the first training data is relevant to the text query, and wherein a label describing the second training data is irrelevant to the text query”.” (Remarks, page 11, paragraph 2.)  The argument is moot in view of new grounds of rejection.  See detailed rejection below.
Applicant argues that the prior art does not teach or suggest “wherein the different labels are generated by deep neural networks...” as recited in claim 16. (Remarks, page 11, paragraph 6.) However, Guan teaches “wherein the different labels are generated by deep neural networks.” (Guan, page 352, column 1, paragraph 1, line 1 “On the other hand, we also select three more sophisticated classification algorithms, including support vector machines [14], Multilayer Perception (sic), and KStar [15].” And page 356, column 1, paragraph 2, line 3 “The first step of our methods is to obtain the initial classifier based on the training data and then predict the labels for some “confident” unlabeled data.  The new labeled data are then used to aid the noise detection in the training data.”  In other words, a Multilayer Perceptron is a deep neural network, and the new labeled data is generated new labels.) Therefore, the rejection is proper and maintained.
Applicant argues that the prior art fails to teach claim 18.  In particular, Applicant argues that “Bruzzone uses context to label images, not text.”  (Remarks, page 12, paragraph 5.) However, claim 18 does not recite text:
18. (currently amended) The method of claim 1, further comprising:
	identifying a first context of the first training data and a second context of the second training data;
	determining that the first context and the second context are different; and
	in response to determining that the first context and the second are different, determining that there [[the]] is a conflict between labels for the first training data and the second training data.

In addition, Fig. 3 of the drawings of the instant application shows images that are labeled. The specification states “FIG. 3 depicts exemplary training data/pictures, which is used to train a machine learning system, but which are labeled by either an incorrect label and/or a vague label.” (Specification, paragraph [0006].) Therefore, the rejection is proper and maintained.
Applicant argues that the prior art fails to teach amended claim 19. (Remarks, page 13, paragraph 3.) The argument is moot in view of new grounds of rejection.  See detailed rejection below.
Applicant argues that the prior art fails to teach amended claim 20.  (3/8/2022 Remarks, page 7, paragraph 6.) The argument is moot in view of new grounds of rejection.  See detailed rejection below.
Applicant traverses the rejection of dependent claims 2-3, 8-10, 13, and 15 based on arguments that the independent claims from which they depend are allowable. (Remarks, page 14, paragraph 4.) The argument is moot based on the rejection of the independent claims from which they depend.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-11, 14-16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Guan et al. (Identifying mislabeled training data with the aid of unlabeled data, herein Guan), Hernandez et al. (A Simple Model for Classifying Web Queries by User Intent, herein Hernandez), and Sun et al. (Identifying and Correcting Mislabeled Training Instances, herein Sun).
Regarding claim 1,
	Guan teaches a method comprising:
	discovering, by a conflict detection system, a conflict between a first training data and a second training data for a machine learning system, wherein the first training data and the second training data are a same [text query], and wherein the first training data and the second training data have different labels that describe the same [text query]; (Guan, page 347, column 1, paragraph 2, line 1 “Shown in Fig. 1, majority filtering begins with n equal-sized disjoint subsets of the training set E (step 1) and the empty output set A of detected noisy examples (step 2).  The main loop (steps 3-16) is repeated for each training subset Ei.  In step 4, subset Et is formed which includes all examples from E except those in Ei, which then is used as the input an arbitrary inductive learning algorithm that induces a hypothesis (a classifier) Hj (step 6).  Those examples from Ei for which majority of the hypotheses does not give the correct classification are added to A as potentially noisy examples (step 14).” In other words, the algorithm is a conflict detection system which detects mislabeled items, and examples are a first training data and a second training data.)

[Image: media_image1.png]
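
For illustration only, the majority filtering procedure quoted from Guan above can be sketched as follows. This is not code from Guan. The three learners (1-nearest-neighbor, 3-nearest-neighbor, and nearest-centroid over a single numeric feature), the function names, and the sample data are all hypothetical choices made to keep the sketch self-contained; Guan permits an arbitrary inductive learning algorithm.

```python
from collections import Counter

def knn_learner(k):
    """Hypothetical learner factory: k-nearest-neighbor on one numeric feature."""
    def learn(train):
        def predict(x):
            nearest = sorted(train, key=lambda ex: abs(ex[0] - x))[:k]
            return Counter(label for _, label in nearest).most_common(1)[0][0]
        return predict
    return learn

def centroid_learner(train):
    """Hypothetical learner: predict the label whose mean feature value is closest."""
    sums = {}
    for x, label in train:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    means = {label: total / count for label, (total, count) in sums.items()}
    return lambda x: min(means, key=lambda label: abs(means[label] - x))

def majority_filter(examples, learners, n_subsets):
    """Return the examples that a majority of hypotheses misclassify."""
    # Split E into n disjoint subsets of (nearly) equal size.
    subsets = [examples[i::n_subsets] for i in range(n_subsets)]
    noisy = []  # output set A of potentially noisy examples
    for i, held_out in enumerate(subsets):
        # Train on every example except those in the held-out subset.
        rest = [ex for j, s in enumerate(subsets) if j != i for ex in s]
        hypotheses = [learn(rest) for learn in learners]
        for x, label in held_out:
            errors = sum(1 for h in hypotheses if h(x) != label)
            if errors > len(hypotheses) / 2:
                noisy.append((x, label))
    return noisy

# Hypothetical data: (1.1, "b") sits among the "a" examples.
examples = [(1.0, "a"), (5.0, "b"), (1.2, "a"), (5.2, "b"), (0.9, "a"),
            (4.9, "b"), (1.1, "b"), (5.1, "b"), (0.8, "a")]
learners = [knn_learner(1), knn_learner(3), centroid_learner]
flagged = majority_filter(examples, learners, n_subsets=3)
```

In this sketch an example is flagged only when more than half of the hypotheses disagree with its label, which corresponds to the majority rule in the quoted passage; a consensus filter would instead require every hypothesis to disagree.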

Thus far, Guan does not explicitly teach using text query data.
Hernandez teaches text query data. (Hernandez, page 2, paragraph 4, line 1 “Our proposal consists in automatically classifying queries using only the text included in the query.”  In other words, classifying queries using only the text included in the query is text query data.)
Both Hernandez and Guan are directed to classifying data, among other things.  Guan teaches identifying mislabeled training data with the help of unlabeled data using machine learning but does not explicitly teach using text queries in the training dataset. Hernandez teaches using machine learning to validate classification of text query training data.  In view of the teaching of Guan, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Hernandez into Guan.  This would result in being able to identify mislabeled training data using machine learning where the training data is derived from text query data.
One of ordinary skill in the art would be motivated to do this because of the large number of people submitting text queries to Web Search Engines. (Hernandez, page 1, paragraph 2, line 1 “Web Search Engines (WSEs) are the most popular tools for access to Internet that people use.  According to [5], nearly 70%  of the people use a WSE for access to the Web.”)
Thus far, the combination of Guan and Hernandez does not explicitly teach in response to discovering the conflict between the first training data and the second training data for the machine learning system based on different labels, adjusting the different labels of the first training data and the second training data; and training the machine learning system using the first training data and the second training data with the adjusted labels.  
Sun teaches in response to discovering the conflict between the first training data and the second training data for the machine learning system based on different labels, adjusting the different labels of the first training data and the second training data; and (Sun, page 1, column 2, paragraph 2, line 31 “Simply eliminating all the instances suspected noise will determinately lost (sic) much useful information in them, thus correcting these instances should be the first choice.  This paper suggests a new approach which uses a Bayesian classifier to identify and correct mislabeled instances.” And, page 1, column 1, paragraph 2, line 6 “There are two possible sources of the class noise [17]: (a) contradictory examples, i.e. the examples which have the same values of all attributes except for the class label; (b) misclassification: instances labeled with wrong classes.”  And, page 2, column 1, paragraph 1, line 7 “Generally, the most challenging task in identifying noise is how to distinguish the mislabeled errors from the exceptions to general rules.  As the discussion above a Bayesian classifier can be used to address this problem.  A Bayesian classifier trained over part of the training data is used to classify the other instances. The probability distribution of the classified instance belonging to each class label can be used to identify noise error and exception error.  The instance who’s original label is different from the predicted label, and the probability of which belonging to the predicted label equals 1 or very close to 1, is identified as noise and can be corrected using the predicted label. The instance with almost equal values in the probability distribution should be identified as an exception.” And, page 2, column 2, paragraph 4, line 1 “The procedures of our proposed approach are given in Algorithm 1 and Algorithm 2.  As shown in Algorithm 1, we first call the procedure evaluate() in Algorithm 2 to calculate the entropy of each instance in training data set S.” In other words, contradictory examples, i.e. the examples which have the same values of all attributes except for the class label is discovering the conflict…based on different labels, and can be corrected using the predicted label is adjusting the different labels of the first training data and the second training data. See Algorithms 1 and 2.)

[Image: media_image2.png]

[Image: media_image3.png]
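
For illustration only, the “contradictory examples” described in the Sun passage quoted above (examples with the same attribute values but different class labels) can be sketched for text queries as follows. This is not code from Sun or from the instant application. The function name, the sample queries and labels, and the decision to treat two queries as the same after trimming whitespace and lowercasing are all assumptions.

```python
from collections import defaultdict

def find_label_conflicts(training_data):
    """Map each query that appears with more than one label to its labels."""
    labels_by_query = defaultdict(set)
    for query, label in training_data:
        # Assumption: queries count as "the same" after trimming and lowercasing.
        labels_by_query[query.strip().lower()].add(label)
    return {query: sorted(labels)
            for query, labels in labels_by_query.items()
            if len(labels) > 1}

# Hypothetical labeled text queries.
training_data = [
    ("How do I reset my password?", "account"),
    ("how do i reset my password?", "billing"),
    ("What is my balance?", "billing"),
]
conflicts = find_label_conflicts(training_data)
```

Here the first two items are the same text query carrying different labels, so that query is reported with both labels; the third item has a single label and is not reported.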

Sun also teaches training the machine learning system using the first training data and the second training data with the adjusted labels.  (Sun, page 1, column 1, paragraph 1, line 10 “This paper focus (sic) on improving the quality of the training data by identifying and correcting mislabeled instances prior to apply (sic) the learning algorithm, thereby increasing prediction accuracy.” In other words, learning algorithm is machine learning system, correcting mislabeled instances prior to applying the learning algorithm is using the first training data and the second training data with the adjusted labels.)
Both Sun and the combination of Guan and Hernandez are directed to classifying data and detecting errors in labeled training data for machine learning, among other things.  The combination of Guan and Hernandez teaches identifying mislabeled training data with the aid of unlabeled data including text queries with machine learning, but does not teach discovering a conflict between two training data items based on mislabeled data and correcting the mislabeling. Sun teaches discovery of conflicts based on mislabeled training data instances using machine learning and correcting the mislabeling. In view of the teaching of the combination of Guan and Hernandez, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Sun into the combination of Guan and Hernandez.  This would result in being able to correct mislabeled training data after detecting that there is a conflict.
One of ordinary skill in the art would be motivated to do this because it is important in providing a generalization from a set of training instances that the data is accurate by correcting mislabeled items. (Sun, page 1, column 1, paragraph 1, “In order to form a good generalization from a set of training instances, a clean training dataset is important.  Unfortunately, real world data is never as perfect as we would like it to be and can often suffer from corruptions. In this paper, a new approach is proposed to identify and correct mislabeled training instances.”) 
Regarding claim 2,
the combination of Guan, Hernandez, and Sun teaches the method of claim 1, further comprising:
	discovering the conflict by: generating unsupervised clustering similar training data to create a training data cluster; and (Guan, Algorithm 3, and page 347, column 1, paragraph 4, line 1 “Both majority filtering and consensus filtering employ multiple classifiers to detect the noisy instances through n-cross-validation.  In cross i, subset i is extracted and checked. The combination of other subsets is used as training data to construct a set of classifiers based on the learning algorithms, which further classify the instances in subset i to detect the noises.  The reliability of these classifiers therefore is crucial and the noise detection performance is expected to improve when the classification accuracies of these classifiers are increased.  Our approach is to utilize the unlabeled data to increase the classification accuracies of the classifiers.”

[Image: media_image4.png]

In other words, subset is cluster, and utilize unlabeled data is generate unsupervised clustering. See Algorithm 3.)
	performing cross validation of the training data cluster in order to filter out training data that creates a false positive from the artificial intelligence.  (Guan, page 347, column 1, paragraph 4, line 1 “Both majority filtering and consensus filtering employ multiple classifiers to detect the noisy instances through n-cross-validation.  In cross i, subset i is extracted and checked. The combination of other subsets is used as training data to construct a set of classifiers based on the learning algorithms, which further classify the instances in subset i to detect the noises.  The reliability of these classifiers therefore is crucial and the noise detection performance is expected to improve when the classification accuracies of these classifiers are increased.  Our approach is to utilize the unlabeled data to increase the classification accuracies of the classifiers.” In other words, n-cross-validation is cross validation.)
Regarding claim 3,
the combination of Guan, Hernandez, and Sun teaches the method of claim 1, wherein
the conflict is a result of human error by human labelers when labeling the first training data and the second training data.  (Guan teaches detection of mislabeled data regardless of source, therefore Guan teaches detection of mislabeled data resulting from human error.  See mapping of claim 1 above.)
Regarding claim 4,
	the combination of Guan, Hernandez, and Sun teaches the method of claim 1, wherein
	the conflict is a result of a vagueness in a single label, and wherein the single label describes different types of queries, and wherein each of the different types of queries need to have unique labels in order to train the machine learning system. (Sun, page 2, column 1, paragraph 1, line 7 “Generally, the most challenging task in identifying noise is how to distinguish the mislabeled errors from the exceptions to general rules.  As in the discussion above, a Bayesian classifier can be used to address this problem.  A Bayesian classifier trained over part of the training data is used to classify the other instances. The probability distribution of the classified instance belonging to each class label can be used to identify noise error and exception error.  The instance who’s original label is different from the predicted label, and the probability of which belonging to the predicted label equals 1 or very close to 1, is identified as noise and can be corrected using the predicted label. The instance with almost equal values in the probability distribution should be identified as an exception.” In other words, the instance with almost equal values is the result of a vagueness in one or more of the different labels.)
Regarding claim 5,
the combination of Guan, Hernandez, and Sun teaches the method of claim 1, wherein 
the machine learning system is a deep neural network, and wherein the first training data and the second training data for the machine learning system are generated from a data document. (Guan, see mapping of claim 1, and page 352, column 1, paragraph 1, line 1 “On the other hand, we also select three more sophisticated classification algorithms, including support vector machines [14], Multilayer Perception (sic), and KStar [15].” And, page 345, column 1, paragraph 1, line 1 “This paper presents a new approach for identifying and eliminating mislabeled training instances for supervised learning algorithms.” And, page 349, column 1, paragraph 3, line 1 “Existing MF, DF, and our proposed MFAUD, CFAUD are tested on the benchmark data sets from the Machine Learning Database Repository [13].  Information of these data sets is tabulated in Table 1. These data sets are collected from different real-world applications in various domains.” In other words, a Multilayer Perceptron is a deep neural network, and the Machine Learning Database Repository [13] is electronic documents, among other things. See Response to Arguments, paragraph 8, subparagraph e above.)
Claims 7-11 are computer program product claims corresponding to method claims 1-5, respectively.  Otherwise, they are the same.  It is implicit that a computer implemented method requires a processor and memory and is able to create and read computer program products.  Therefore, claims 7-11 are rejected for the same reasons as claims 1-5, respectively. 
Claims 14 and 15 are computer system claims corresponding to method claims 1 and 2, respectively.  Outside of that, they are the same.  It is implicit that a computer implemented method requires a computer system with a processor and memory.  Therefore, claims 14 and 15 are rejected for the same reasons as claims 1 and 2, respectively.
Regarding claim 16,
	the combination of Guan, Hernandez, and Sun teaches the method of claim 1, wherein 
	the different labels are generated by deep neural networks. (Guan, page 345, column 1, paragraph 1, line 1 “This paper presents a new approach for identifying and eliminating mislabeled training instances for supervised learning algorithms.” And, page 352, column 1, paragraph 1, line 1 “On the other hand, we also select three more sophisticated classification algorithms, including support vector machines [14], Multilayer Perception (sic), and KStar [15].”  In other words, classification is generating labels and Multilayer Perceptrons are deep neural networks.)
Regarding claim 19,
	The combination of Guan, Hernandez, and Sun teaches the method of claim 1, further comprising:
	determining that a first label for the first training data is incorrect; determining that a second label for the second training data is correct; in response to determining that the first label is incorrect and that the second label is correct, (Sun, page 1, column 1, paragraph 2, line 10 “This paper focus on improving the quality of the training data by identifying and correcting mislabeled instance prior to apply the learning algorithm, thereby increasing prediction accuracy.” And page 1, column 2, paragraph 3, line 1 “When we use a Bayesian classifier to classify an instance, we can get a probability distribution that is the probability of the instance belonging to each class label. When one of the probability equals 1 or very close to 1, and others equal 0 or close to 0, which means that this instance is a typical one of the class label where the probability equals 1 or very close to 1.  In contrast when all of the probabilities almost equal, the instance should not be a typical one of any class label under considered.” In other words, when all the probabilities are almost equal is determining the first label is incorrect, and when the probability of the instance is 1 or close to one and others are equal to 0 or close to 0 is determining that a second label is correct.) 
	presenting multiple examples of the second training data to an oracle to adjust the first training data. (Sun, Algorithm 2 - Evaluate, and page 2, column 2, paragraph 4, line 2 “As shown in Algorithm 1, we first call the procedure evaluate () in Algorithm 2 to calculate the entropy of each instance in training data set S.” and, page 3, column 1, paragraph 1, line 1 “Fixed Value: The system automatically assigns a value for [symbol image: media_image5.png], where possible values for [symbol image: media_image5.png] are specified experiences or empirical results.  Because the users should know the noise level in data set if the first scheme is employed, the second is used in this work. If Ek is lower than T and the original label of Ik is different from the prediction, it will be tagged as mislabeled instance.  Then we replace the class label of mislabeled instance with the prediction.”  In other words, Algorithm 2 with the procedure evaluate (), is the oracle which is used to adjust the label of the first training data.)
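
For illustration only, the entropy test quoted from Sun above (tag an instance as mislabeled when its entropy Ek is lower than T and its original label differs from the prediction, then replace the label with the prediction) can be sketched as follows. This is not Sun's Algorithm 2. The function names, the use of the natural logarithm, and the threshold value in the usage lines are assumptions, and the class probabilities are taken as given rather than produced by a Bayesian classifier.

```python
import math

def entropy(class_probabilities):
    """Shannon entropy of a class distribution (natural log is an assumption)."""
    return -sum(p * math.log(p) for p in class_probabilities.values() if p > 0)

def corrected_label(original_label, class_probabilities, threshold):
    """Return the prediction if the instance is tagged as mislabeled, else the original."""
    predicted = max(class_probabilities, key=class_probabilities.get)
    if entropy(class_probabilities) < threshold and predicted != original_label:
        return predicted  # low entropy and disagreement: replace the label
    return original_label

# Hypothetical threshold T and class distributions.
T = 0.3
confident = corrected_label("a", {"a": 0.01, "b": 0.99}, T)
uncertain = corrected_label("a", {"a": 0.5, "b": 0.5}, T)
```

A confident, low-entropy prediction that disagrees with the original label triggers a replacement, while a near-uniform distribution has high entropy and leaves the original label in place.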
Claims 6, 12, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Guan, Hernandez, Sun, and Patil et al (Analysis of Banking Data Using Machine Learning, herein Patil).

Regarding claim 6,
	the combination of Guan, Hernandez, and Sun teaches the method of claim 1, wherein
	the text query is [about a service provided by an enterprise], wherein a label describing the first training data is relevant to the text query, and wherein a label describing the second training data is irrelevant to the text query. (Sun, page 1, column 1, paragraph 2, line 6 “There are two possible sources of the class noise [17]: (a) contradictory examples, i.e. the examples which have the same values of all attributes except for the class label; (b) misclassification: instances labeled with wrong classes.” In other words, contradictory examples is label describing the first training data is relevant and the label describing the second training data is irrelevant.  “text query” was previously mapped.  See mapping of claim 1.) 
	Thus far, the combination of Guan, Hernandez, and Sun does not explicitly teach the data is about a service provided by an enterprise.
	Patil teaches the data is about a service provided by an enterprise (Patil, page 1, column 1, paragraph 2, line 2 “It contains customer account information, transaction information, all financial data etc. Data analytics can be used to analyze large volume data to extract meaningful information, hidden patterns and to discover knowledge from the large volume data [12].  Banks are facing various challenges like customer retention, fraud detection, risk management [2] and customer segmentation.” And, page 879, column 1, paragraph 2, line 1 “This dataset is used for customer retention problem.  This data is prepared under the guidance of bank.  It contains bank customer’s information such as customer id, age, gender, balance, income, credit card status, marital status, loan type, account type, number of transaction he makes, education and job of customer.” In other words, credit card status, loan type, and transactions are data that is about a service provided by an enterprise. Examiner notes the claimed invention is directed to discovering and resolving training data conflicts in machine learning systems.  It effectively doesn’t matter whether the data comes from the financial industry or some other industry because the methods in the claimed invention that are  used for discovering and resolving training conflicts in the data can be used regardless of the source of the data.)
	Both Patil and the combination of Guan, Hernandez, and Sun are directed to classifying data, among other things.  The combination of Guan, Hernandez, and Sun is directed to identifying and resolving training conflicts in training data but does not explicitly teach that the training data is about a service provided by an enterprise.  Patil teaches training data that is about a service provided by an enterprise.  In view of the teaching of the combination of Guan, Hernandez, and Sun, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Patil into the combination of Guan, Hernandez, and Sun.  This would result in being able to identify and resolve conflicts in training data that is about services provided by an enterprise.
	One of ordinary skill in the art would be motivated to do this because there is a large amount of important data associated with the banking industry which can be used to extract meaningful information. (Patil, page 876, column 1, paragraph 2, line 1 “The Banking industry generates a massive volume of data every day.  It contains customer account information, transaction information, all financial data etc.  Data analytics can be used to analyze large volume data to extract meaningful information from it [7].”)
Claim 12 is a computer program product claim corresponding to method claim 6.  Otherwise, they are the same.  Therefore, claim 12 is rejected for the same reasons as claim 6.
Claim 17 is a computer system claim corresponding to method claim 6.  Otherwise, they are the same.  Therefore, claim 17 is rejected for the same reasons as claim 6.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Guan, Hernandez, Sun, and Schikuta et al (A Cloud-Based Neural Network Simulation Environment, herein Schikuta).
Regarding claim 13,
	The combination of Guan, Hernandez, and Sun teaches the computer program product of claim 7,
	Thus far, the combination of Guan, Hernandez, and Sun does not explicitly teach wherein the program code is provided as a service in a cloud environment. 
	Schikuta teaches wherein the program code is provided as a service in a cloud environment. (Schikuta, page 1, paragraph 1, line 1 “We present N2Sky, a novel Cloud-based neural network simulation environment.”  In other words, a Cloud-based neural network simulation environment is program code provided as a service in a cloud environment.)
	Both Schikuta and the combination of Guan, Hernandez, and Sun are directed to the implementation of machine learning for practical applications generally, and the improvement of machine learning, among other things.  The combination of Guan, Hernandez, and Sun teaches discovering and resolving training data conflicts in machine learning systems, but does not specifically teach providing the program code as a service in a cloud environment.  Schikuta teaches providing the machine learning code as a service in a cloud environment.  In view of the teaching of Guan, Hernandez, and Sun, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Schikuta into the combination of Guan, Hernandez, and Sun.  This would result in being able to provide detecting and resolving training data conflicts for machine learning systems in a cloud environment.
One of ordinary skill in the art would be motivated to do this in order to provide the ability to perform neural network simulation in a cloud-based environment, thus saving the cost of systems and development. (Schikuta, page 1, paragraph 1, line 2 “The system implements a transparent environment aiming to enable arbitrary and experienced users to do neural network simulations easily and comfortably.  The necessary resources, as CPU-cycles, storage space, etc. are provide by using Cloud infrastructure.”)
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Guan, Hernandez, Sun, and Bruzzone et al (A Novel Context-Sensitive Semisupervised SVM Classifier Robust to Mislabeled Training Samples, herein Bruzzone).
Regarding claim 18,
	The combination of Guan, Hernandez, and Sun teaches the method of claim 1, further comprising:
	Thus far, the combination of Guan, Hernandez, and Sun does not explicitly teach identifying a first context of the first training data and a second context of the second training data; determining that the first context and the second context are different; and in response to determining that the first context and the second are different, determining that there is a conflict between labels for the first training data and the second training data.
	Bruzzone teaches identifying a first context of the first training data and a second context of the second training data; determining that the first context and the second context are different; and in response to determining that the first context and the second are different, determining that there is a conflict between labels for the first training data and the second training data. (Bruzzone, page 2143, column 2, paragraph 2, line 1 “The idea behind the proposed methodology is to exploit information of the context patterns X to reduce the bias effect of the mislabeled training patterns on the definition of the discriminating hyperplane of the SVM classifier, thus decreasing the sensitivity of the learning algorithm to unreliable training samples.” And column 2, paragraph 3, line 14 “This strategy is defined according to a learning procedure for the proposed CS4VM that is based on two main steps: 1) supervised learning with original training samples and classification of the (unlabeled) context patterns and 2) contextual semisupervised learning based on both original labeled patterns and semilabeled context patterns according to a novel cost function.” In other words, unlabeled context patterns is context of the first training data, semilabeled context patterns is context of the second training data, and mislabeled training patterns is a conflict between labels.)
	Both Bruzzone and the combination of Guan, Hernandez, and Sun are directed to identifying mislabeled training data, among other things. In view of the teaching of the combination of Guan, Hernandez, and Sun, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Bruzzone into the combination of Guan, Hernandez, and Sun.  This would result in being able to identify mislabeled training data with the addition of using context.
	One of ordinary skill in the art would be motivated to do this to improve classification when the training set is not reliable. (Bruzzone, page 2142, column 1, paragraph 1, line 1 “This paper presents a novel context-sensitive semi-supervised support vector machine (CS4VM) classifier, which is aimed at addressing classification problems where the available training set is not fully reliable, i.e., some labeled samples may be associated to the wrong information class (mislabeled patterns).”) 
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Guan, Hernandez, Sun, Bruzzone, and Patil.
Regarding Claim 20,
	The combination of Guan, Hernandez, Sun, and Bruzzone teaches the method of claim 18, wherein
	the first context describes a financial service provided by a financial institution, and wherein the second context describes a security feature provided by the financial institution. (Patil, page 1, column 1, paragraph 2, line 2 “It contains customer account information, transaction information, all financial data etc. Data analytics can be used to analyze large volume data to extract meaningful information, hidden patterns and to discover knowledge from the large volume data [12].  Banks are facing various challenges like customer retention, fraud detection, risk management [2] and customer segmentation.” In other words, transaction information is the first context, which describes a financial service, and fraud detection is the second context, which describes a security feature provided by the financial institution.)
	Both Patil and the combination of Guan, Hernandez, Sun, and Bruzzone are directed to classifying data, among other things.  The combination of Guan, Hernandez, Sun, and Bruzzone teaches discovering and resolving training data conflicts in machine learning systems using context, but does not explicitly teach financial service data from the financial industry.  Patil teaches classifying data that is financial service data from the financial service industry.  In view of the teaching of the combination of Guan, Hernandez, Sun, and Bruzzone, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Patil into the combination of Guan, Hernandez, Sun, and Bruzzone.  This would result in being able to discover and resolve training data conflicts in machine learning systems using context where the data is from the financial service industry.
	One of ordinary skill in the art would be motivated to do this because the large amount of data requiring processing and analysis in the financial services industry would benefit from accurate training data. (Patil, page 876, column 1, paragraph 2, line 1 “The Banking industry generates a massive volume of data every day.  It contains customer account information, transaction information, all financial data etc.  Data analytics can be used to analyze large volume data to extract meaningful information from it [7].”)
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Bovolo, F., “A Context-Sensitive Technique Based on Support Vector Machines for Image Classification”, teaches using spatial context for classifying images with support vector machines.
Brodley, C., “Identifying Mislabeled Training Data”, teaches a method for identifying and eliminating mislabeled training data by using a set of learning algorithms to create classifiers that serve as noise filters.
Grotton, T., “A Comparison of Language Identification Approaches on Short, Query-Style Texts”, teaches classifying text queries.
Hagen, M., “What was the Query? Generating Queries for Document Sets with Application in Cluster Labeling”, teaches using queries to retrieve a given set of documents using cluster labeling.  
Kotsiantis, S., “Forecasting Fraudulent Financial Statements using Data Mining”, teaches using an ensemble of classifiers to identify fraudulent financial statements.
Vajda, S., “Semi-automatic ground truth generation using unsupervised clustering and limited manual labeling: Application to handwritten character recognition”, teaches a generic, semi-automatic labeling technique for large handwritten character collections using unsupervised clustering.
Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359. The examiner can normally be reached Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/B.I.R./Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124