Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to or not subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

DETAILED ACTION
Claims 1-20 are pending of which claims 1, 8 and 15 are independent.  
The application was filed on December 29, 2017, claims no domestic benefit, and claims no foreign priority.   This application is currently assigned to Verizon Media Inc.  

Information Disclosure Statement
No information disclosure statements (IDSs) have been filed in this application to date.

Oath/ADS
An Application Data Sheet was submitted 12/29/2017, and an Oath/declaration was submitted on 12/29/2017.

Comments
It is noted that, during examination, a claim must be given its broadest reasonable interpretation consistent with the specification.  Under a broadest reasonable interpretation, words of the claim must be given their plain meaning, unless such meaning is inconsistent with the specification.  M.P.E.P. 2173.01(I).  It is respectfully submitted that each claim is to be interpreted based on the 
It is noted that care should be taken to ensure that the claims themselves explicitly recite all the claimed elements relied upon in overcoming the rejections set forth herein.  That is, for any additional limitations discussed in the specification to be considered, the claims should be amended such that the limitations are explicitly recited in the claims themselves.  Appropriate consideration of each and every feature of the claims has been made.  
Applicants’ representative is welcome and encouraged to contact the examiner (Maryam Ipakchi) to discuss the application in an attempt to expedite prosecution.  The examiner may be reached via telephone at 571-270-3237, via direct fax at 571-270-4237 and/or via electronic mail (Maryam.ipakchi@uspto.gov), provided that written authorization to communicate by electronic mail has been given.  Any email communication must include written authorization for the USPTO to communicate with the Applicant concerning any subject matter of this application via electronic mail (see MPEP 502.03).  Sample authorization language:  "Recognizing that Internet communications are not secure, I hereby authorize the USPTO to communicate with me concerning any subject matter of this application by electronic mail. I understand that a copy of these communications will be made of record in the application file."
Interview requests may be made via an Interview Agenda setting forth proposed participants, items to be discussed and proposed interview times (see MPEP 713.01(III.)).  The Interview Agenda may be submitted via the AIR Form (http://www.uspto.gov/patent/uspto-automated-interview-request-air-

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication 20150332157 to Baughman et al. 

Regarding independent claims 1, 8 and 15, Baughman teaches:

Claim 1.  A method, implemented on a machine having at least one processor, storage, and a communication platform capable of connecting to a network for validating labels of training data, the method comprising: | Claim 8. A system for validating labels of training data, comprising: at least one processor configured to | Claim 15. A non-transitory computer readable medium including computer executable instructions, wherein the instructions, when executed by a computer, cause the computer to perform a method for validating labels of training data, the method comprising: (Baughman, FIGS. 1-4 memory 405, processor 404; [0004] Embodiments of the present invention disclose a method, computer program product, and system for locating natural resources … The computer processor combines one or more pairs of clusters of the plurality of clusters based, at least in part, on a similarity heuristic applied to the one or more pairs of clusters, and the computer processor determines a plurality of probabilities respectively corresponding to a validity of each hypothesis of the plurality of hypotheses).

receiving a first group of data records associated with the training data, wherein each of the first group of data records includes a vector having at least one feature and a first label; (Baughman, FIGS. 1-4 memory 405, processor 404; [0011] Some embodiments of the present invention process information from sources of content, which is received as a dataset and includes: text, graphs, charts, data, and images of historical information, hereafter referred to as “historical data”, and text, graphs, charts, data, and images of scientific information, hereafter referred to as “scientific data”. Historical data may also include content from logs, content from blogs, content from posts and other Internet-based online sources. The sources of content also include real-time data from instruments, and sensors, as well as multimedia content, which may include images, presentations, graphics, audio, and video. The information processing includes parsing large amounts of content included in the dataset, generating topic-based data, summarizing the data, correlating and determining relationships, and creating clusters of like-topic content based on a heuristic. Some embodiments of the present invention extract features of the clusters to produce feature vectors, which are stored in a proximity matrix, to which a proximity algorithm is applied to further determine similarity of clusters to combine.; [0017]-[0018], [0022] named entities; [0032]-[0033]; [0074]-[0077] vectors, labels, features).

for each of the first group of data records, determining a second label based on the at least one feature in accordance with a first model; (Baughman, FIGS. 1-4 memory 405, processor 404; [0011] Some embodiments of the present invention process information from sources of content, which is received as a dataset and includes: text, graphs, charts, data, and images of historical information, hereafter referred to as “historical data”, and text, graphs, charts, data, and images of scientific information, hereafter referred to as “scientific data”. Historical data may also include content from logs, content from blogs, content from posts and other Internet-based online sources. The sources of content also include real-time data from instruments, and sensors, as well as multimedia content, which may include images, presentations, graphics, audio, and video. The information processing includes parsing large amounts of content included in the dataset, generating topic-based data, summarizing the data, correlating and determining relationships, and creating clusters of like-topic content based on a heuristic. Some embodiments of the present invention extract features of the clusters to produce feature vectors, which are stored in a proximity matrix, to which a proximity algorithm is applied to further determine similarity of clusters to combine.; [0012] further optimized; [0017]-[0018], [0032]-[0033]; [0076]-[0077] vectors, labels, features; examiner notes that, at the very least, applying the proximity algorithm to further determine the similarity of clusters to combine may correspond to determining a second label that ‘further’ defines the data); 

obtaining a loss based on the first label associated with the data record and the second label; classifying the data record as having an incorrect first label when the loss meets a pre-determined criterion (Baughman, FIGS. 1-4; [0012], [0073] By mapping the clusters of data to a topology, the data is represented in a similar context and comparisons of likeness and difference can be made. Main topics of clusters may be correlated with other clusters to determine how similar or different; [0075]-[0078] Determining the limit for the aggregation of clusters as well as those clusters that remain separate, creates the basis of a trained proximity matrix. To determine whether to keep clusters together or separate, a threshold of proximity is determined. The threshold that limits combining clusters is determined by scoring the favorable cluster space based on a metric, such as the overall accuracy of an experiment or a RAND index, for example. A RAND index or linear discriminate analysis may be used to tightly couple clusters or spread clusters out. A rand index is a measure of the similarity between two data clusters, and linear discriminate analysis finds a linear combination of features that can be used to separate or characterize two data clusters. The threshold is varied and cluster arrangements are iterated towards finding a best cluster space, which becomes the starting point of a model to be used to predict hypothesis probability); and 

generating a sub-group of the first group of data records, each of which has the incorrect first label (Baughman, FIGS. 1-4; [0075]-[0078] Determining the limit for the aggregation of clusters as well as those clusters that remain separate, creates the basis of a trained proximity matrix. To determine whether to keep clusters together or separate, a threshold of proximity is determined. The threshold that limits combining clusters is determined by scoring the favorable cluster space based on a metric, such as the overall accuracy of an experiment or a RAND index, for example. A RAND index or linear discriminate analysis may be used to tightly couple clusters or spread clusters out. A rand index is a measure of the similarity between two data clusters, and linear discriminate analysis finds a linear combination of features that can be used to separate or characterize two data clusters. The threshold is varied and cluster arrangements are iterated towards finding a best cluster space, which becomes the starting point of a model to be used to predict hypothesis probability; examiner notes that data not within the threshold is not within a given cluster and is instead in another cluster, which may at the very least be considered a sub-group; features of sub-group?).
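
For illustration only, the steps recited in independent claims 1, 8 and 15 (receiving labeled records, determining a second label with a first model, obtaining a loss, and collecting the records classified as incorrectly labeled) may be sketched as follows. This sketch is not taken from the application or from Baughman. The names DataRecord, toy_model, second_label, loss and find_mislabeled, the use of a negative log-likelihood as the loss, and the "loss exceeds a threshold" criterion are all assumptions made solely for the example.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class DataRecord:
    features: Sequence[float]  # vector having at least one feature
    first_label: str           # label supplied with the training data


# Assumption: the "first model" maps a feature vector to one probability per label.
Model = Callable[[Sequence[float]], Dict[str, float]]


def toy_model(features: Sequence[float]) -> Dict[str, float]:
    """Hypothetical stand-in model: a logistic function of the first feature."""
    p_pos = 1.0 / (1.0 + math.exp(-features[0]))
    return {"pos": p_pos, "neg": 1.0 - p_pos}


def second_label(model: Model, record: DataRecord) -> str:
    """Second label: the label the model considers most probable."""
    probs = model(record.features)
    return max(probs, key=probs.get)


def loss(model: Model, record: DataRecord) -> float:
    """Assumed loss: negative log-likelihood of the first label under the model."""
    p = model(record.features).get(record.first_label, 0.0)
    return -math.log(max(p, 1e-12))  # clamp to avoid log(0)


def find_mislabeled(records: List[DataRecord], model: Model,
                    threshold: float) -> List[DataRecord]:
    """Sub-group of records whose loss meets the assumed criterion (loss > threshold)."""
    return [r for r in records if loss(model, r) > threshold]


if __name__ == "__main__":
    records = [
        DataRecord([3.0], "pos"),
        DataRecord([3.0], "neg"),   # disagrees with the model's prediction
        DataRecord([-2.0], "neg"),
    ]
    for r in find_mislabeled(records, toy_model, threshold=1.0):
        print(r, "-> second label:", second_label(toy_model, r))
```

In this sketch the sub-group is simply the list returned by find_mislabeled; a different loss or criterion could be substituted without changing the overall flow.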

Baughman pertains to data analytics and systems and methods for generating topic-based datasets by parsing content received from a plurality of information sources (Baughman, Abstract; [0001], [0004]), and Baughman teaches many features in relation to various exemplary embodiments.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to specifically employ features of various embodiments as a single combination in order to address different needs/purposes and meet specific quality and/or precision requirements (Baughman, [0101]-[0104]). 


Regarding dependent claims 2, 9 and 16, Baughman teaches:

wherein the loss is determined based on a discrepancy between the first label and the second label (Baughman, FIGS. 1-4; [0063]; [0075]-[0078]; [0080] A set of hypothesis with known answers or target values is used, and the features associated with the hypothesis with known results are fed into the model. The desired output of the model is to match the known results of the hypothesis applied to the model. The error is determined and is used to adjust the beta values of the feature variables of the model. Adjustments are regularized and a Hilger space is added to the error term for each exemplar, which is the combination of a hypothesis and the features supporting the hypothesis, to avoid over-fitting of the model. The model is iterated, making adjustments to the model to achieve or approach the known resulting values, for each of multiple hypothesis and known values.).


Regarding dependent claims 3, 10 and 17, Baughman teaches:

further comprising: correcting, for each of the sub-group of data records, the associated first label based on the corresponding second label (Baughman, FIGS. 1-4; [0063]; [0075]-[0078]; [0080] A set of hypothesis with known answers or target values is used, and the features associated with the hypothesis with known results are fed into the model. The desired output of the model is to match the known results of the hypothesis applied to the model. The error is determined and is used to adjust the beta values of the feature variables of the model. Adjustments are regularized and a Hilger space is added to the error term for each exemplar, which is the combination of a hypothesis and the features supporting the hypothesis, to avoid over-fitting of the model. The model is iterated, making adjustments to the model to achieve or approach the known resulting values, for each of multiple hypothesis and known values.).


Regarding dependent claims 4, 11 and 18, Baughman teaches:

further comprising: clustering the sub-group of data records into one or more clusters; and identifying, with respect to each of the one or more clusters, a cause that leads to the incorrect first labels (Baughman, FIGS. 1-4; [0017]-[0018]; [0027]-[0028]; [0075]-[0080] Resource locator program 300 trains models using supervised learning based on known answers (step 355). The logistic regression models are trained using supervised learning in which the correct results are known. … The final output generating probabilities of the hypothesis that can be, in one embodiment of the present invention, displayed as a heat map. Resource locator program 300 applies supervised learning to the best cluster space result of the unsupervised learning. To insure the best feature vectors, based on evidence having the highest confidence levels, are used for dimensions of evidence defining the routes of models, resource locator program 300 uses distributions of distributions of source data, processed by analytic engines to select the high probability feature vectors.).

Regarding dependent claims 5, 12 and 19, Baughman teaches:

further comprising: determining, for at least some cause identified, a correction directed to a labeling model used in generating a corresponding first label. (Baughman, FIGS. 1-4; [0017]-[0018]; [0027]-[0028] By successive refinement, based on the data applied to the models, the multiple models are trained. The supervised learning algorithm is applied to the models to best learn how to combine deep evidence using a probability density function, in which all the features directed to a model are combined to result in a probability aligning with the particular subject determination of the model. Each model within a route of multiple models is a probability density function. The trained models determine and apply confidence values to the plurality of candidate hypotheses or topics from the features of the proximity matrix, and the confidence values or probabilities of the candidate hypotheses can be reported. A report can be produced that includes the top “N” candidate hypotheses, based on the determined probabilities and/or confidence values of the hypotheses, where “N” can be a user set number. The dimensions of evidence and features used in support of the probabilities of the hypotheses can be traced back to the images and text content sources on which they are based; [0075]-[0080] Resource locator program 300 trains models using supervised learning based on known answers (step 355). The logistic regression models are trained using supervised learning in which the correct results are known. … The final output generating probabilities of the hypothesis that can be, in one embodiment of the present invention, displayed as a heat map. Resource locator program 300 applies supervised learning to the best cluster space result of the unsupervised learning. To insure the best feature vectors, based on evidence having the highest confidence levels, are used for dimensions of evidence defining the routes of models, resource locator program 300 uses distributions of distributions of source data, processed by analytic engines to select the high probability feature vectors.).



Regarding dependent claims 6, 13 and 20, Baughman teaches:

wherein the labeling model corresponds to a heuristic labeling model associated with a threshold; and the correction directed to the heuristic labeling model is to adjust the threshold of the heuristic labeling model. (Baughman, FIGS. 1-4; [0023]-[0034]; [0057] output data 204 feeds to operational steps of resource locator program 300 in which different data outputs are correlated and cross correlated, and like-data is clustered, based on a heuristic. An example of a heuristic to use in clustering is to apply a scheme of assigning numeric values to the data and using a proximity algorithm to determine Cartesian distances between data and data clusters; [0071], [0075]-[0078] Determining the limit for the aggregation of clusters as well as those clusters that remain separate, creates the basis of a trained proximity matrix. To determine whether to keep clusters together or separate, a threshold of proximity is determined; [0080]; examiner notes associated how? Adjust how/based on what?).


Double Patenting
Applicant appears to have multiple co-pending related applications.  Applicant should take caution to ensure that related applications do not include claims of identical scope or obvious variants thereof.  In view of this notice to the Applicant and Applicant’s own superior knowledge of pending and/or issued related applications, the examiner retains the ability to issue a double patenting rejection in a Final rejection, if appropriate, without establishing a new ground of rejection.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  For the prior art applied to the claims, as set forth above, the Examiner has cited particular columns and line numbers (or paragraphs) in the references for the convenience of the applicant.  Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. More particularly, e.g., in the instances where the Examiner has identified Figures of the applied prior art reference, it is understood that the corresponding portions of the written description describing the identified Figures are relied upon.  The Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or identified by the Examiner. The entire reference(s) is/are to be considered to provide disclosure relating to the claimed invention.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARYAM M IPAKCHI whose telephone number is (571)270-3237.  The examiner can normally be reached on M-F Flex 6-3pm (AltFriOff).  Any interview requests should be made via an Interview Agenda setting forth proposed participants, items to be discussed and proposed interview times (see MPEP 713.01(III.)).  The Interview Agenda may be submitted via the AIR Form (http://www.uspto.gov/patent/uspto-automated-interview-request-air-form.html) and/or faxed to the examiner at (571)270-4237 so that the Examiner may review the materials in advance to provide meaningful discussion in order to advance prosecution.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached at (571)270-3169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 


/MARYAM M IPAKCHI/
Primary Examiner, Art Unit 2171