DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner notes the entry of the following papers:
Amended claims filed 9/9/2021.
Amended specification filed 9/9/2021.
Applicant arguments/remarks made in amendment filed 9/9/2021.
In view of amendments, objections to specification are withdrawn.
Claim objections with regard to the interpretation of “data” are withdrawn. For the purposes of examination, Examiner is interpreting “data” as singular.
Claims 4, 7, and 16-19 are amended.
Claims 1-20 are presented for examination.
Response to Arguments
Applicant’s arguments filed 8/19/2021 have been fully considered but they are not persuasive.
Applicant’s arguments beginning on page 11 are directed to limitations in amended independent claim 7, and by extension independent claims 1 and 14, and in amended claims 4 and 16-19. Each is addressed.
Prior art fails to teach a limitation of claim 7 (page 11, paragraph 2), specifically
	discovering a conflict between a first training data and a second training data for a machine learning system, wherein the first training data and the second training data are ground truths that describe a same type of entity, wherein ground truths are data that are known to be accurate when describing an entity (underlined to show the amendment).
	
	With regard to “ground truths,” the specification of the instant application, paragraph [0039], cites “a ground truth is data that is known to be accurate, since it is based on an observation made “on the ground” where the event/object is based.  One type of ground truth conflict involves two or more identical ground truth examples, each with a different class/label.” Examiner notes that despite being called a ground truth which is supposed to be “known to be true” the claimed invention is directed to conflict detection in which one or more “known truths” are in fact mislabeled and therefore not true.  In such an instance, the labels being described as “ground truths” are not distinguishable from other mislabeled data.  Therefore, an algorithm which detects conflicts, such as the algorithm described in Guan, will detect the instances of mislabeling in the “ground truth” data as well as the data that is not described as “ground truth.” The algorithm described in Guan, Fig. 1, will detect conflicts in the case of a particular training datum being labeled with two conflicting labels because it examines every element individually.  It also detects if both conflicting labels are incorrect.
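For illustration only, the conflict type quoted from paragraph [0039] (identical examples carrying different labels) can be sketched as follows. The function name, the feature tuples, and the sample labels are hypothetical and are taken neither from the specification nor from Guan; the sketch only shows that such a check treats every labeled example alike, whether or not it is called a ground truth.

```python
from collections import defaultdict

def find_identical_example_conflicts(examples):
    """Return {features: sorted labels} for feature tuples that carry more than one label."""
    labels_by_features = defaultdict(set)
    for features, label in examples:
        labels_by_features[tuple(features)].add(label)
    # A conflict exists wherever identical features were given different labels.
    return {f: sorted(ls) for f, ls in labels_by_features.items() if len(ls) > 1}

# Hypothetical data: the first two examples are identical but labeled differently.
examples = [
    ((1.0, 2.0), "cat"),
    ((1.0, 2.0), "dog"),
    ((3.0, 4.0), "cat"),
]
conflicts = find_identical_example_conflicts(examples)
```

The check reports that a conflict exists but cannot say which of the conflicting labels, if any, is correct.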
Prior art fails to teach a limitation of amended claim 4 (page 12, paragraph 5), specifically
the method of claim 1, wherein the conflict is a result of vagueness in one or more of the different labels, and wherein the different labels describe different species of the same genus. 

Prior art fails to teach a limitation of claim 16 (page 12, paragraph 7), specifically
	wherein the different labels are generated by deep neural networks
	The combination of Guan and Sun teaches wherein the different labels are generated by deep neural networks (Guan, page 345, column 1, paragraph 7, line 1 “This paper presents a new approach for identifying and eliminating mislabeled training instances for supervised learning algorithms.” And, page 352, column 1, paragraph 1, line 1 “On the other hand, we also select three more sophisticated classification algorithms, including support vector machines [14], Multilayer Perception (sic), and, KStar [15].” In other words, classification is generating labels and Multilayer Perceptrons are deep neural networks.)
Prior art fails to teach amended claim 17 (page 13, paragraph 2), specifically
The method of claim 1, further comprising: discovering the conflict by: training a model of the entity from the ground truths; generating a probability vector for each label in the ground truths, wherein the probability vector is a confidence vector describing a probability that each label is accurate; clustering generated probability vectors; determining that clustered generated probability vectors contain examples spanning multiple ground truth labels; and in response to determining that clustered generated probability vectors contain the examples spanning multiple ground truth labels, determining that the different labels conflict with one another.

	The combination of Guan and Sun teaches amended claim 17. Claim 17 has been amended in its entirety.  In the interest of brevity, see the mapping of amended claim 17 in paragraph 23.
Prior art fails to teach amended claim 18 (page 14, paragraph 2), specifically
The method of claim 1, further comprising: identifying a first context of the first training data and a second context of the second training data; determining that the first context and the second context are different; and in response to determining that the first context and the second are different, determining that there the conflict between labels for the first training data and the second training data.
	
	The combination of Guan, Sun, and Bruzzone teaches the amended claim 18.  Claim 18 has been amended in its entirety.  In the interest of brevity, see the mapping of claim 18 in paragraph 26.
Prior art fails to teach amended claim 19 (page 14, paragraph 5), specifically
The method of claim 1, further comprising: determining that a first label for the first training data is incorrect; determining that a second label for the second training data is correct; in response to determining that the first label is incorrect and that the second label is correct, presenting multiple examples of the second training data to an oracle to adjust the first training data.
	
	The combination of Guan and Sun teaches amended claim 19.   Claim 19 has been amended in its entirety.  In the interest of brevity, see the mapping of claim 19 in paragraph 24.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claim 4 is rejected under 35 U.S.C. 112(a), first paragraph, as failing to comply with the written description requirement. The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.  In particular, claim 4 recites the method of claim 1, wherein the conflict is a result of vagueness in one or more of the different labels, and wherein the different labels describe different species of the same genus.  There is no mention of or support for “genus” or “species” in the specification.
Further, the example of vagueness given in [0048] of the specification applies the same, more general (genus) label “cat” to images of different species, rather than two different labels as recited in the amended claim language.  Thus the limitation wherein the different labels describe different species of the same genus, causing a conflict, is not supported by the specification; the conflict described in [0048] is the same label describing two different species.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-12, and 14-19 are rejected under 35 U.S.C. 103 as being unpatentable over Guan et al (Identifying mislabeled training data with the aid of unlabeled data, herein Guan), and Sun et al (Identifying and Correcting Mislabeled Training Instances, herein Sun).
Regarding claim 1,
	Guan teaches a method comprising:
	discovering, by a conflict detection system, a conflict between a first training data and a second training data for a machine learning system, wherein the first training data and the second training data are ground truths that describe a same type of entity, and wherein the first training data and the second training data have different labels; (Guan, page 347, column 1, paragraph 2, line 1 “Shown in Fig. 1, majority filtering begins with n equal-sized disjoint subsets of the training set E (step 1) and the empty output set A of detected noisy examples (step 2).  The main loop (steps 3-16) is repeated for each training subset Ei.  In step 4, subset Et is formed which includes all examples from E except those in Ei, which then is used as the input an arbitrary inductive learning algorithm that induces a hypothesis (a classifier) Hj (step 6).  Those examples from Ei for which majority of the hypotheses does not give the correct classification are added to A as potentially noisy examples (step 14).” In other words, the algorithm is a conflict detection system which detects mislabeled items.)

[Image media_image1.png: Guan, Fig. 1 (majority filtering algorithm)]
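As a minimal sketch of the majority filtering procedure quoted from Guan above: the training set is split into n disjoint subsets, several classifiers are trained on everything outside the held-out subset, and a held-out example is flagged as potentially noisy when a majority of the classifiers disagree with its label. The learners used here (nearest centroid and k-nearest neighbor), the function names, and the sample data are assumptions chosen for illustration; Guan states only that an arbitrary inductive learning algorithm may be used.

```python
import math
from collections import Counter

def nearest_centroid(train):
    """Learn one centroid per label; predict the label of the closest centroid."""
    sums, counts = {}, Counter()
    for x, y in train:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    return lambda x: min(centroids, key=lambda y: math.dist(x, centroids[y]))

def k_nearest(k):
    """Build a k-nearest-neighbor learner (majority vote among the k closest examples)."""
    def learn(train):
        def predict(x):
            nearest = sorted(train, key=lambda ex: math.dist(x, ex[0]))[:k]
            return Counter(y for _, y in nearest).most_common(1)[0][0]
        return predict
    return learn

def majority_filter(examples, learners, n_folds):
    """Return the examples that a majority of the hypotheses misclassify."""
    folds = [examples[i::n_folds] for i in range(n_folds)]  # n disjoint subsets
    noisy = []                                              # output set A
    for i, held_out in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        hypotheses = [learn(train) for learn in learners]
        for x, y in held_out:
            errors = sum(1 for h in hypotheses if h(x) != y)
            if errors > len(hypotheses) / 2:
                noisy.append((x, y))
    return noisy

# Hypothetical data: two well-separated groups and one mislabeled point.
examples = [
    ((0, 0), "a"), ((10, 10), "b"), ((0, 1), "a"), ((10, 11), "b"),
    ((1, 0), "a"), ((11, 10), "b"), ((1, 1), "a"), ((11, 11), "b"),
    ((0.5, 0.5), "a"), ((10.5, 10.5), "b"),
    ((0.2, 0.8), "b"),  # lies in group "a" but is labeled "b"
]
noisy = majority_filter(examples, [nearest_centroid, k_nearest(1), k_nearest(3)], n_folds=3)
```

Because each example is examined individually against the hypotheses, the filter does not depend on whether the label was originally described as a ground truth.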

Thus far, Guan does not explicitly teach in response to discovering the conflict between the first training data and the second training data for the machine learning system, adjusting the different labels of the first training data and the second training data; and
Sun teaches in response to discovering the conflict between the first training data and the second training data for the machine learning system, adjusting the different labels of the first training data and the second training data; and (Sun, page 1, column 2, paragraph 2, line 31 “Simply eliminating all the instances suspected noise will determinately lost (sic) much useful information in them, thus correcting these instances should be the first choice.  This paper suggests a new approach which uses a Bayesian classifier to identify and correct mislabeled instances.” And, page 2, column 1, paragraph 1, line 7 “Generally, the most challenging task in identifying noise is how to distinguish the mislabeled errors from the exceptions to general rules.  As the discussion above a Bayesian classifier can be used to address this problem.  A Bayesian classifier trained over part of the training data is used to In other words, can be corrected using the predicted label is adjusting the different labels of the first training data and the second training data. See algorithms 1 and 2.)

[Image media_image2.png: Sun, Algorithms 1 and 2]


One of ordinary skill in the art would be motivated to do this because it is important in providing a generalization from a set of training instances that the data is accurate by correcting mislabeled items. (Sun, page 1, column 1, paragraph 1, “In order to form a good generalization from a set of training instances, a clean training dataset is important.  Unfortunately, real world data is never as perfect as we would like it to be and can often suffer from corruptions. In this paper, a new approach is proposed to identify and correct mislabeled training instances.”) 
training the machine learning system using the first training data and the second training data with the adjusted labels.  (Sun, page 1, column 1, paragraph 1, line 10 “This paper focus (sic) on improving the quality of the training data by identifying and correcting mislabeled instances prior to apply (sic) the learning algorithm, thereby increasing prediction accuracy.” In other words, learning algorithm is machine learning system, correcting mislabeled instances prior to applying the learning algorithm is using the first training data and the second training data with the adjusted labels.)
Regarding claim 2,
	The combination of Guan and Sun teaches the method of claim 1,
further comprising: discovering the conflict by: generating unsupervised clustering similar ground truth training data to create a training data cluster; and (Guan, page 347, column 1, paragraph 4, line 1 “Both majority filtering and consensus filtering employ multiple classifiers to detect the noisy instances through n-cross-validation.  In cross i, subset i is extracted and checked. The combination of other subsets is used as training data to construct a set of classifiers based on the learning algorithms, which further classify the instances in subset i to detect the noises.  The reliability of these classifiers therefore is crucial and the noise detection performance is expected to improve when the classification accuracies of these classifiers are increased.  Our approach is to utilize the unlabeled data to increase the classification accuracies of the classifiers.” In other words, utilize unlabeled data is generate unsupervised clustering.  See Algorithm 3.)

[Image media_image3.png: Guan, Algorithm 3]

performing cross validation of the training data cluster in order to filter out training data that creates a false positive from the artificial intelligence.  (Guan, page 347, column 1, paragraph 4, line 1 “Both majority filtering and consensus filtering employ multiple classifiers to detect the noisy instances through n-cross-validation.  In cross i, subset i is extracted and checked. The combination of other subsets is used as training data to construct a set of classifiers based on the learning algorithms, which further classify the instances in subset i to detect the noises.  The reliability of these classifiers therefore is crucial and the noise detection performance is expected to improve when the classification accuracies of these classifiers are increased.  Our approach is to utilize the unlabeled data to increase the classification accuracies of the classifiers.” In other words, n-cross-validation is cross validation.)
Regarding claim 3,
The combination of Guan and Sun teaches the method of claim 1,
wherein the conflict is a result of human error by human labelers when labeling the first training data and the second training data.  (Guan teaches detection of mislabeled data.  See paragraph 11 above.  Further, Guan teaches detection of mislabeled data regardless of the source of the mislabeling.)
Regarding claim 4,
	The combination of Guan and Sun teaches the method of claim 1,
	wherein the conflict is a result of a vagueness in one or more of the different labels,  and wherein the different labels describe different species from a same genus. (Sun, page 2, column 1, paragraph 1, line 7 “Generally, the most challenging task in identifying noise is how to distinguish the mislabeled errors from the exceptions to general rules.  As in the discussion In other words, the instance with almost equal values is the result of a vagueness in one or more of the different labels )
Regarding claim 5,
The combination of Guan and Sun teaches the method of claim 1,
	wherein the machine learning system is a deep neural network, and wherein the first training data and the second training data for the machine learning system are generated from a data document. (Guan, see paragraph 11, Algorithm 1 MajorityFiltering above. It is pseudocode and does not distinguish whether the data items are words from documents or images.  It can be used in either case.  Further, the algorithm does not distinguish between the types of machine learning systems that will use the data for training.)
Regarding claim 6,
	The combination of Guan and Sun teaches the method of claim 1,
wherein the machine learning system is a convolutional neural network, and wherein the first training data and the second training data for the machine learning system are generated from photographs. (Guan, see paragraph 11, Algorithm 1 MajorityFiltering above. It is pseudocode and does not distinguish whether the data items are words from documents or images.  It can be used in either case.  Further, the algorithm does not distinguish between the types of machine learning systems that will use the data for training.) 
Claim 7 is a computer program product claim corresponding to method claim 1.  In addition, claim 7 adds the phrase “wherein ground truths are data that are known to be accurate when describing an entity” to clarify the previous limitation “wherein the first training data and the second training data are ground truths that describe a same type of entity.”  Otherwise, they are the same.  It is implicit that a computer implemented method requires a processor and memory and is able to create and read computer program products.  Therefore, claim 7 is rejected for the same reasons as claim 1. See the mapping of claim 1 in paragraph 11.
Claims 8-12 are computer program product claims corresponding to method claims 2-6, respectively.  Outside of that, they are the same.  Therefore, claims 7-12 are rejected for the same reasons as claims 1-6 respectively.
Claims 14 and 15 are computer system claims corresponding to method claims 1 and 2, respectively.  Outside of that, they are the same.  It is implicit that a computer implemented method requires a computer system with a processor and memory.  Therefore, claims 14 and 15 are rejected for the same reasons as claims 1 and 2, respectively.
Regarding claim 16,
	The combination of Guan and Sun teaches the method of claim 1,
wherein the different labels are generated by deep neural networks. (Guan, page 345, column 1, paragraph 1, line 1 “This paper presents a new approach for identifying and eliminating mislabeled training instances for supervised learning algorithms.” And, page 352, column 1, paragraph 1, line 1 “On the other hand, we also select three more sophisticated classification algorithms, including support vector machines [14], Multilayer Perception (sic), and, KStar [15].” In other words, classification is generating labels and Multilayer Perceptrons are deep neural networks.)
Regarding claim 17, 
	The combination of Guan and Sun further teaches the method of claim 1,
	further comprising: discovering the conflict by: training a model of the entity from the ground truths; (Guan, page 345, column 2, paragraph 1, line 1 “The goal of an inductive learning algorithm is to form a good generalization model constructed on training instances.” And, page 345, column 1, paragraph 1, line 3 “The novelty of this approach lies in the using of unlabeled instances to aid the detection of mislabeled training instances. This is in contrast with existing methods which rely upon only the labeled training instances.” In other words, constructed on training instances is training, and labeled training instances are ground truths.  As described in the mapping of claim 1, paragraph 13, the concept of ground truths that are verified to be true, but then turn out to be mislabeled does not distinguish from labeled data, not identified as ground truths, that may be mislabeled with regard to an algorithm that is designed to detect mislabeled data.  The Guan algorithm will detect the mislabeled data whether the original data is generated as a ground truth or whether it isn’t.)
generating a probability vector for each label in the ground truths, wherein the probability vector is a confidence vector describing a probability that each label is accurate; (Sun, page 1, column 2, paragraph 3, line 1 “When we use a Bayesian classifier to classify an instance, we can get a probability distribution that is the probability of the instance belonging to each class label.”  In other words, probability distribution is probability vector and probability of the instance belonging to each class label is probability that each label is accurate.)
	clustering generated probability vectors; determining that clustered generated probability vectors contain examples spanning multiple ground truth labels; and (Sun, page 2, column 2, paragraph 4, line 4 “To calculate the entropy E, as shown in Algorithm 2, we first randomize the sequence of instances in S and partition S in N subsets.  Given any subset Pi, a Bayesian classifier Ci learned over S\Pi (the complementary subset of Pi) is used to evaluate the typicality of the instances in Pi.  For each instance Ik, we will get the probability distribution Dk of Ik belonging to class labels under considered, then from Dk we can calculate the entropy Ek by Equation 1.” And, page 2, column 1, paragraph 1, line 1 “When one of the probability equals 1 or very close to 1, and others equal 0 or close to 0, which means that this instance is a typical one of the class label where the probability equals 1 or very close to 1.  In contrast, when all of the probabilities are almost equal, the instance should not be a typical one of any class label under considered (sic).” In other words, n subsets are clustered probability vectors. And each subset n contains examples spanning multiple ground truths. See Algorithm 2.) 

[Image media_image4.png: Sun, Algorithm 2]

	in response to determining that clustered generated probability vectors contain the examples spanning multiple ground truth labels, determining that the different labels conflict with one another. (Sun, page 2, column 1, paragraph 1, line 13 “The probability distribution of the classified instance belonging to each class label can be used to identify noise error and exception error.  The instance whose original label is different from the predict label, and the probability of which belonging to the predict label equals 1 or very close to 1, is identified as noise and can be corrected using the predict label.” In other words, original label is different from the predict label is determining that the different labels conflict with one another.)
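The claim 17 steps mapped above (generate a probability vector per example, cluster the vectors, and treat a cluster whose members span several labels as a conflict) can be sketched as follows. This is an illustrative reading of the claim language only: the greedy radius-based clustering, the function names, and the sample vectors are assumptions, and neither the specification nor Sun is represented as using this particular clustering method.

```python
import math

def cluster_vectors(vectors, radius):
    """Greedy clustering: each vector joins the first cluster whose seed lies within radius."""
    clusters = []  # list of (seed_vector, [member indices])
    for idx, vec in enumerate(vectors):
        for seed, members in clusters:
            if math.dist(vec, seed) <= radius:
                members.append(idx)
                break
        else:
            clusters.append((vec, [idx]))  # no seed close enough: start a new cluster
    return [members for _, members in clusters]

def clusters_spanning_labels(vectors, labels, radius):
    """Return (member indices, labels) for every cluster containing more than one label."""
    conflicts = []
    for members in cluster_vectors(vectors, radius):
        spanned = sorted({labels[i] for i in members})
        if len(spanned) > 1:
            conflicts.append((members, spanned))
    return conflicts

# Hypothetical probability vectors over the classes (cat, dog) and their assigned labels.
prob_vectors = [(0.95, 0.05), (0.90, 0.10), (0.92, 0.08), (0.05, 0.95), (0.10, 0.90)]
labels = ["cat", "cat", "dog", "dog", "dog"]
conflicts = clusters_spanning_labels(prob_vectors, labels, radius=0.2)
```

In this sample the third example is labeled "dog" although its probability vector sits with the "cat" examples, so the first cluster spans two labels and is reported as a conflict.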
Both Guan and Sun are directed to detecting errors in labeled training data for machine learning.  In view of the teaching of Guan, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Sun into Guan.  This would result in being able to use probability distributions to identify mislabeled data.
	One of ordinary skill in the art would be motivated to do this because correcting mislabeled data, rather than simply eliminating it, preserves useful information.
Regarding claim 19,
	The combination of Guan and Sun teaches the method of claim 1, further comprising:
	determining that a first label for the first training data is incorrect; determining that a second label for the second training data is correct; in response to determining that the first label is incorrect and that the second label is correct, (Sun, page 1, column 1, paragraph 2, line 10 “This paper focus on improving the quality of the training data by identifying and correcting mislabeled instance prior to apply the learning algorithm, thereby increasing prediction accuracy.” And page 1, column 2, paragraph 3, line 1 “When we use a Bayesian classifier to classify an instance, we can get a probability distribution that is the probability of the instance belonging to each class label. When one of the probability equals 1 or very close to 1, and others equal 0 or close to 0, which means that this instance is a typical one of the class label where the probability equals 1 or very close to 1.  In contrast when all of the probabilities almost equal, the instance should not be a typical one of any class label under considered.” In other words, when all the probabilities are almost equal is determining the first label is incorrect, and when the probability of the instance is 1 or close to one and others are equal to 0 or close to 0 is determining that a second label is correct.) 
	presenting multiple examples of the second training data to an oracle to adjust the first training data. (Sun, Algorithm 2 Evaluate, and page 2, column 2, paragraph 4, line 2 “As S.” and, page 3, column 1, paragraph 1, line 1 “Fixed Value: The system automatically assigns a value for [symbol: media_image5.png], where possible values for [symbol: media_image5.png] are specified experiences or empirical results.  Because the users should know the noise level in data set if the first scheme is employed, the second is used in this work. If Ek is lower than T and the original label of Ik is different from the prediction, it will be tagged as mislabeled instance.  Then we replace the class label of mislabeled instance with the prediction.”  In other words, Algorithm 2, with the procedure evaluate (), is the oracle which is used to adjust the label of the first training data.)
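The correction rule quoted from Sun (an instance whose entropy Ek is lower than the threshold T and whose original label differs from the prediction is relabeled with the prediction) can be sketched as follows. Sun's Equation 1 is not reproduced in this action, so Shannon entropy is assumed here; the function names, the threshold value, and the sample distributions are likewise hypothetical, and the probability distributions are taken as given rather than produced by a trained Bayesian classifier.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a label probability distribution; assumed form of Ek."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def correct_labels(instances, threshold):
    """Replace a label with the prediction when entropy is below threshold and they differ."""
    corrected = []
    for original, dist in instances:
        predicted = max(dist, key=dist.get)  # most probable class label
        if entropy(dist) < threshold and predicted != original:
            corrected.append(predicted)      # tagged as mislabeled, so corrected
        else:
            corrected.append(original)       # typical and agreeing, or too uncertain
    return corrected

# Hypothetical (original label, predicted distribution) pairs.
instances = [
    ("dog", {"cat": 0.98, "dog": 0.02}),  # confident disagreement
    ("dog", {"cat": 0.55, "dog": 0.45}),  # nearly equal probabilities
    ("cat", {"cat": 0.99, "dog": 0.01}),  # confident agreement
]
new_labels = correct_labels(instances, threshold=0.5)
```

Only the first instance is relabeled; the second is left unchanged because its nearly equal probabilities give an entropy above the threshold.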
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Guan, Sun, and Bruzzone et al (A Novel Context-Sensitive Semisupervised SVM Classifier Robust to Mislabeled Training Samples, herein Bruzzone).
Regarding claim 18,
	The combination of Guan and Sun teaches the method of claim 1, further comprising:
	Thus far, the combination of Guan and Sun does not explicitly teach identifying a first context of the first training data and a second context of the second training data; determining that the first context and the second context are different; and in response to determining that the first context and the second are different, determining that there the conflict between labels for the first training data and the second training data.
	Bruzzone teaches identifying a first context of the first training data and a second context of the second training data; determining that the first context and the second context are different; and in response to determining that the first context and the second are different, determining that there the conflict between labels for the first training data and the second training data. (Bruzzone, page 2143, column 2, paragraph 2, line 1 “The idea behind the proposed methodology is to exploit information of the context patterns X to reduce the bias effect of the mislabeled training patterns on the definition of the discriminating hyperplane of the SVM classifier, thus decreasing the sensitivity of the learning algorithm to unreliable training samples.” And column 2, paragraph 3, line 14 “This strategy is defined according to a learning procedure for the proposed CS4VM that is based on two main steps: 1) supervised learning with original training samples and classification of the (unlabeled) context patterns and 2) contextual semisupervised learning based on both original labeled patterns and semilabeled context patterns according to a novel cost function.” In other words, unlabeled context patterns is context of the first training data, semilabeled context patterns is context of the second training data, and mislabeled training patterns is a conflict between labels.)
	Both Bruzzone and the combination of Guan and Sun are directed to identifying mislabeled training data, among other things. In view of the teaching of the combination of Guan and Sun, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Bruzzone into the combination of Guan and Sun.  This would result in being able to identify mislabeled training data with the addition of using context.
	One of ordinary skill in the art would be motivated to do this to improve classification when the training set is not reliable. (Bruzzone, page 2142, column 1, paragraph 1, line 1 “This paper presents a novel context-sensitive semi-supervised support vector machine (CS4VM) classifier, which is aimed at addressing classification problems where the available training set 
Claims 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Guan, Sun, and Schikuta et al (A Cloud-Based Neural Network Simulation Environment, herein Schikuta).
Regarding claim 13,
	The combination of Guan and Sun teaches the computer program product of claim 7,
	Thus far, the combination of Guan and Sun does not explicitly teach wherein the program code is provided as a service in a cloud environment. 
	Schikuta teaches wherein the program code is provided as a service in a cloud environment. (Schikuta, page 1, paragraph 1, line 1 “We present N2Sky, a novel Cloud-based neural network simulation environment.”  In other words a Cloud-based neural network simulation environment is program code provided as a service in a cloud environment.)
	Both Schikuta and the combination of Guan and Sun are directed to machine learning generally and the improvement of machine learning.  In view of the teaching of Guan and Sun, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Schikuta into the combination of Guan and Sun.  This would result in being able to provide the program code in a cloud environment.
One of ordinary skill in the art would be motivated to do this in order to provide the ability to improve data quality in a cloud environment. 
Claim 20 is a computer system claim corresponding to computer program product claim 13.  Outside of that, they are the same.  Therefore, Claim 20 is rejected for the same reasons as claim 13.
Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359. The examiner can normally be reached Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/B.I.R./Examiner, Art Unit 2124
/BRIAN M SMITH/Primary Examiner, Art Unit 2122