DETAILED ACTION
This action is responsive to the Amendment filed on 06/30/2022. Claims 1-20 remain pending in the case. Claims 1, 11, and 16 are independent claims.

Claim Objections
Claims 2, 6-15, and 18-20 are objected to because of the following informalities:
Claim 2:
Line 2 recites “said at least one of internal consistency and external inconsistency” where “[[said]]the at least one of the internal inconsistency and the external inconsistency” was apparently intended.
Claim 6:
Line 2 recites “said at least one of internal consistency and external inconsistency” where “[[said]]the at least one of the internal inconsistency and the external inconsistency” was apparently intended.
Claim 7:
Line 2 recites “said at least one of internal consistency and external inconsistency” where “[[said]]the at least one of the internal inconsistency and the external inconsistency” was apparently intended.
Line 4 recites “data to be labeledof difficulty and.” This self-evidently contains multiple errors.
Claim 8:
Line 2 recites “said at least one of internal consistency and external inconsistency” where “[[said]]the at least one of the internal inconsistency and the external inconsistency” was apparently intended.
Claim 9:
Line 2 recites “said at least one of internal consistency and external inconsistency” where “[[said]]the at least one of the internal inconsistency and the external inconsistency” was apparently intended.
Claim 10:
Line 2 recites “being an aid to said annotator {…}.” It is respectfully submitted that the word “aid,” as used here, appears to be virtually synonymous with the word “help,” such that its metes and bounds are improperly subjective/relative in nature (in other words, whether something is sufficiently “helpful” is in the eye of the beholder).
Claim 11:
Line 5 recites the concept of “aid.” It is respectfully submitted that the word “aid,” as used here, appears to be virtually synonymous with the word “help,” such that its metes and bounds are improperly subjective/relative in nature (in other words, whether something is sufficiently “helpful” is in the eye of the beholder).
Claim 13:
Line 3 recites the concept of “aid.” It is respectfully submitted that the word “aid,” as used here, appears to be virtually synonymous with the word “help,” such that its metes and bounds are improperly subjective/relative in nature (in other words, whether something is sufficiently “helpful” is in the eye of the beholder).
Claim 14:
Line 2 recites “data to be .” where “data” was apparently intended.
Claim 18:
Line 2 recites “said at least one of internal consistency and external inconsistency” where “[[said]]the at least one of the internal inconsistency and the external inconsistency” was apparently intended.
Claim 19:
Line 2 recites “said at least one of internal consistency and external inconsistency” where “[[said]]the at least one of the internal inconsistency and the external inconsistency” was apparently intended.
Claim 20:
Line 3 recites “being an aid to said annotator {…}.” It is respectfully submitted that the word “aid,” as used here, appears to be virtually synonymous with the word “help,” such that its metes and bounds are improperly subjective/relative in nature (in other words, whether something is sufficiently “helpful” is in the eye of the beholder).
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 5, 7, and 14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention. The term “difficulty” in claims 5, 7, and 14 is a relative term which renders the claims indefinite. The term is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. In other words, the term is relative/subjective in nature and is indefinite because, as presented, its metes and bounds cannot be objectively ascertained (i.e. what a person would or would not find “difficult” is subjective in nature). For purposes of prior art analysis, the Office will adopt a broadest reasonable interpretation of the term and will attempt to extrapolate Applicant’s underlying intentions for the term in light of the Specification.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Drouhard, M. et al. (2017, April). Aeonium: Visual analytics to support collaborative qualitative coding. In Pacific Visualization Symposium (PacificVis), 2017 IEEE (pp. 220-229). IEEE (included in the IDS filed on 12/03/2018, hereinafter “Drouhard”).

As to independent claims 1 and 16, Drouhard shows a method and a concomitant computer for improving performance of said computer implementing a machine learning system [e.g. the “Aeonium System” (page 4, section 4)], comprising:
providing, via a graphical user interface, to an annotator, unlabeled corpus data to be labeled [“{…} Aeonium has two interfaces: one for coding data, shown in Figure 1, and another for reviewing codes, keywords, and definitions, shown in Figure 2. The coding interface supports efficient coding decisions by locating the color-coded definitions centrally on the screen and by showing keywords (if present within a given tweet) highlighted in the color used to represent the associated code. The coding interface also affords more in-depth analysis by showing extended definitions and visual overviews of keywords for a selected code in the lower panel. {…}” (page 4, section 4, 2nd paragraph)];
obtaining, via said graphical user interface, labels for said unlabeled corpus data to be labeled [e.g. obtaining labels/“codes” for said unlabeled “tweets”/corpus data to be labeled (fig. 1)];
detecting, with a consistency calculation routine, concurrent with said labeling, a plurality of inconsistencies comprising at least one of internal inconsistency in said labeling and external inconsistency in said labeling [“Inter-rater reliability (IRR) is a widely used measure for evaluating consistency between coders. There are various ways to calculate IRR, and one of the most common methods is Cohen's Kappa {…} In our study, we use IRR (specifically Cohen's Kappa) as a metric for agreement or consistent application of codes between two coders in order to measure improvement in consistency under various conditions.” (page 2, section 2.5)];
responsive to said detection of said at least one of internal inconsistency and external inconsistency, intervening in said labeling, concurrent with said labeling, with a reactive intervention subsystem until said at least one of internal inconsistency in said labeling and external inconsistency in said labeling is addressed; completing said labeling subsequent to said intervening [“Aeonium allows coders to explicitly flag ambiguous data through the interface, and it also infers ambiguity from disagreement between collaborators. The combination of explicit ambiguity labeling and implicit recognition of ambiguity through disagreement supports qualitative coders' analysis of challenging data. {…}” (page 3, section 3.1, 3rd paragraph);
“In addition to drawing out ambiguity, Aeonium facilitates reviewing and resolving disagreements. In the review interface, the code comparison table summarizes how users and their partners agree or disagree with each other. For instance, in Figure 2.7, the interface shows that among the tweets coded with the "Support" label by the current user, his/her partner has disagreed twice, with one tweet coded as "Rejection" and another as "Uncodable." By clicking on each row, users can filter to tweets that belong to the selected code combination and focus on analyzing, providing feedback, and resolving inconsistency. Currently, for each tweet on which partners' codes disagreed, Aeonium provides three response options for the disagreement: "My code is correct", "My partner's code is correct", or "Unsure." "Unsure" may indicate that either code might apply or that there is insufficient context to make a coding judgment. When a user indicates that his/her partner's original code is correct, the system will ask that user whether or not to change the assigned code to match the partner's code. This feedback loop can help coders become more consistent with each other over time, and it implicitly supports the iterative nature of qualitative coding (design objective 3). Additionally, since disagreement can be an indication of ambiguity, having pairwise comparison also contributes to our design objective 1 to draw out ambiguous data.” (page 4, section 4.2)
For even further context/examples of the “concurrent with said labeling” aspects, see also the plurality of “intervention” alternatives illustrated in figs. 1 and 2 that are provided “concurrently with said labeling.” See also how “{…} Aeonium lets users provide expanded definitions based on how code meanings are evolving, and this information is displayed to collaborators in both the coding and review interfaces, in which users see their partners' definitions and their own extended definitions in addition to the master coder definitions.” (page 5, section 4.4)];
carrying out training of said machine learning system to provide a trained machine learning system, based on results of said completing of said labeling subsequent to said intervening; and carrying out classifying new data with said trained machine learning system [“Aeonium supports qualitative focused coding, facilitating the review of coded data between coding partners. {…} Aeonium may be used for qualitative coding of any short text documents, but we have conducted our initial studies with a tweet dataset. 
Aeonium trains a support vector machine (SVM) classifier for each user based on their labels (i.e. coded tweets) and features (or keywords). Keyword features consist of system keywords (i.e. bag-of-words unigram features) extracted from the data set and user-defined keywords (i.e. one or more selected words that are not necessarily contiguous) extracted from coded tweets that are relevant to explicitly explaining the code decision. The feature value is computed by matching the words in the tweet to the keywords. The classifiers are used only for the purpose of suggesting tweets to label. Aeonium has two interfaces: one for coding data, shown in Figure 1, and another for reviewing codes, keywords, and definitions, shown in Figure 2. The coding interface supports efficient coding decisions by locating the color-coded definitions centrally on the screen and by showing keywords (if present within a given tweet) highlighted in the color used to represent the associated code. The coding interface also affords more in-depth analysis by showing extended definitions and visual overviews of keywords for a selected code in the lower panel. The review interface facilitates negotiation of assigned codes between a pair of coders, as well as reinterpretation of data given evolving code definitions and new insights from data.” (page 4, section 4) | For further context/examples, see also page 1, section 2.1 and page 2, section 2.2.].
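
By way of illustration only, the keyword feature computation described in the passage quoted above (“the feature value is computed by matching the words in the tweet to the keywords”) might be sketched as follows. The binary feature value, the tokenizer, and every name and sample string below are assumptions made for this sketch; Drouhard does not disclose them, and the SVM training step itself is not reproduced.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def keyword_features(tweet, keywords):
    """Return one 0/1 value per keyword (assumed binary feature value).

    Each keyword is a tuple of one or more words that need not be contiguous;
    the feature is 1 only when every word of the keyword occurs in the tweet.
    """
    tokens = tokenize(tweet)
    return [1 if set(words) <= tokens else 0 for words in keywords]

# Hypothetical keywords and tweet, for illustration only.
keywords = [("policy",), ("not", "fair"), ("great",)]
features = keyword_features("The new POLICY is simply not very fair", keywords)
```

A feature vector of this kind, paired with the code assigned by a given user, would be the sort of input on which a per-user classifier could be trained.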

As to dependent claim 2, Drouhard further shows:
said at least one of internal consistency and external inconsistency comprises at least said external inconsistency [“{…} factors such as mood changes, attention, and memory can impact consistency between different coders, or inter-rater reliability, as well as consistency of the same coder at different points in time. {…}” (page 3, section 3.1)];
and said intervening comprises providing a difference view via said graphical user interface, said difference view revealing labeling disagreements between said annotator and at least one additional annotator [“In addition to drawing out ambiguity, Aeonium facilitates reviewing and resolving disagreements. In the review interface, the code comparison table summarizes how users and their partners agree or disagree with each other. {…}” (page 4, Section 4.2)].
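
By way of illustration only, a code comparison table of the kind described in the quoted passage can be sketched as a count of (user code, partner code) pairs. The function names and the sample data are hypothetical and are not taken from Drouhard; the sample merely mirrors the example in which a partner disagrees twice with tweets the user coded as "Support."

```python
from collections import Counter

def code_comparison(user_codes, partner_codes):
    """Count (user code, partner code) pairs over items both coders labeled."""
    shared = user_codes.keys() & partner_codes.keys()
    return Counter((user_codes[t], partner_codes[t]) for t in shared)

def disagreements(user_codes, partner_codes):
    """List the ids of shared items on which the two coders' codes differ."""
    shared = sorted(user_codes.keys() & partner_codes.keys())
    return [t for t in shared if user_codes[t] != partner_codes[t]]

# Hypothetical item ids and codes, for illustration only.
user = {"t1": "Support", "t2": "Support", "t3": "Support", "t4": "Rejection"}
partner = {"t1": "Support", "t2": "Rejection", "t3": "Uncodable", "t4": "Rejection"}

table = code_comparison(user, partner)
flagged = disagreements(user, partner)
```

Selecting a row of such a table corresponds to filtering the item list down to one code pair, which is what `flagged` approximates for all disagreeing pairs at once.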

As to dependent claim 3, Drouhard further shows:
said intervening comprises providing a label versioning view via said graphical user interface, said label versioning view explaining how and why a label has changed over time, displayed with data that prompted the change [“{…} Regardless of the objective for a particular project, iteration and reflection on the data and codes are critical, so they must be supported to facilitate qualitative coding appropriately. 
Aeonium facilitates various levels of iteration and deliberation on the data. The coding interface helps link the discovery of new concepts and ideas to previously coded data through code definitions, example data, and highlighted keywords. The review interface supports resolution of disagreements, evaluation of ambiguity, and evolving code definitions. Shifting between the interfaces is also closely aligned with the qualitative coding practice of alternating between coding and reflecting.” (page 4, 2nd – 3rd paragraphs)
“Fig. 1. Aeonium's coding interface consists of two panels: The top panel primarily supports the coding task and displays a single tweet to code (1) with keywords that the model recognizes highlighted (2), buttons to flag that tweet as ambiguous, save it, or make it an exemplar tweet for the selected code (3), and a row of color-coded buttons representing codes in the coding schema with the code definition and an exemplar tweet from master coders (4). The bottom panel enhances the coding task by providing additional information such as the code definitions and examples of the master, user, and partner (5) to highlight descrepancies, distribution of coded tweets for the system, user, and partner keywords (6) to illustrate keyword relevance, and the user's previously coded tweets (7) for history.” (page 5; fig. 1)].

As to dependent claims 4 and 17, Drouhard further shows:
said intervening comprises providing exemplars for at least one incorrect label via said graphical user interface, said exemplars comprising historical examples that have agreement corresponding to said at least one incorrect label [“Aeonium's coding interface shows codes in a given codebook, along with their definitions and examples just below the tweet. By placing these definitions centrally within interface, users always have at least peripheral awareness of the definitions. This functionality serves design objective 2. Researchers in our interviews indicated that codes tend have vague boundaries, so centralizing code definitions can help coders make more efficient, better informed coding decisions. Additionally, since each user likely has a slightly different understanding of code meanings, explicit support for negotiation can improve consistency and facilitate iteration (design objective 3). The review interface in Aeonium lets users provide expanded definitions based on how code meanings are evolving, and this information is displayed to collaborators in both the coding and review interfaces, in which users see their partners' definitions and their own extended definitions in addition to the master coder definitions.” (page 5, section 4.4)
“Fig. 2. The review interface supports understanding and negotiation of code boundaries and discrepancies between coders. Its tab-based view allows for switching between the detail view of each code (1). Each code tab provides comparison and edit capability of code definitions (2-4), overview of the data distribution (5), and distribution of coded tweets for keywords extracted (6), comparison of codes (7) to summarize agreement or disagreement between coders, and the list of coded tweets for analysis. Tweets can be filtered with search terms (8) or by selecting a keyword (from 6) or a code pair (from 7). Users can reevaluate codes and provide feedback about a disagreement through a dropdown menu (9) or provide more context for their decisions by adding keywords extracted directly from the tweets (10).” (page 6, fig. 2)].

As to dependent claim 5, Drouhard further shows:
said intervening comprises retest and evaluation, said retest and evaluation comprising administering to said annotator, via said graphical user interface, a quiz with instant feedback, based on agreement labeled data which is causing difficulty for said annotator [“In addition to drawing out ambiguity, Aeonium facilitates reviewing and resolving disagreements. In the review interface, the code comparison table summarizes how users and their partners agree or disagree with each other. For instance, in Figure 2.7, the interface shows that among the tweets coded with the "Support" label by the current user, his/her partner has disagreed twice, with one tweet coded as "Rejection" and another as "Uncodable." By clicking on each row, users can filter to tweets that belong to the selected code combination and focus on analyzing, providing feedback, and resolving inconsistency. Currently, for each tweet on which partners' codes disagreed, Aeonium provides three response options for the disagreement: "My code is correct", "My partner's code is correct", or "Unsure." "Unsure" may indicate that either code might apply or that there is insufficient context to make a coding judgment. When a user indicates that his/her partner's original code is correct, the system will ask that user whether or not to change the assigned code to match the partner's code. This feedback loop can help coders become more consistent with each other over time, and it implicitly supports the iterative nature of qualitative coding (design objective 3). Additionally, since disagreement can be an indication of ambiguity, having pairwise comparison also contributes to our design objective 1 to draw out ambiguous data.” (page 4, section 4.2) 
“{…} All participants completed a survey in which they labeled 20 tweets. Participants were provided the definitions for the codes given in Section 5.3, and they were required to provide a code for each tweet without the option to label tweets as ambiguous. After labeling the tweets in the survey, participants were asked to rate their confidence in the codes they had applied to the tweets on a scale of 1 to 5.” (page 7, section 5.4.2) | For even further context/examples, see also: pages 3-4, section 3.3.].

As to dependent claim 6, Drouhard further shows:
said at least one of internal consistency and external inconsistency comprises at least said external inconsistency [“{…} factors such as mood changes, attention, and memory can impact consistency between different coders, or inter-rater reliability, as well as consistency of the same coder at different points in time. {…}” (page 3, section 3.1)];
and said intervening comprises label curation via said graphical user interface, to facilitate discussion about meaning of a given label with at least one additional annotator [“{…} Aeonium supports this aim in two ways: the ML model predicts tweets for which partners may disagree, and users may explicitly flag ambiguous tweets during coding. Showing tweets that are likely to be ambiguous or inconsistent between coders draws users' attention to these tweets and encourages a dialogue between coders to improve mutual understanding and consistent coding. The ambiguity flag is also useful for exploring sources of confusion or uncertainty in coding.” (page 4, section 4.1)
“Fig. 2. The review interface supports understanding and negotiation of code boundaries and discrepancies between coders. Its tab-based view allows for switching between the detail view of each code (1). Each code tab provides comparison and edit capability of code definitions (2-4), overview of the data distribution (5), and distribution of coded tweets for keywords extracted (6), comparison of codes (7) to summarize agreement or disagreement between coders, and the list of coded tweets for analysis. Tweets can be filtered with search terms (8) or by selecting a keyword (from 6) or a code pair (from 7). Users can reevaluate codes and provide feedback about a disagreement through a dropdown menu (9) or provide more context for their decisions by adding keywords extracted directly from the tweets (10).” (page 6, fig. 2)].

As to dependent claim 7, Drouhard further shows:
said at least one of internal consistency and external inconsistency comprises at least said external inconsistency [“{…} factors such as mood changes, attention, and memory can impact consistency between different coders, or inter-rater reliability, as well as consistency of the same coder at different points in time. {…}” (page 3, section 3.1)];
and said intervening comprises re-ordering, by difficulty in the labeling, said unlabeled corpus data to be labeledof difficulty and [“As described in design objective 1, qualitative researchers are interested in ways to draw out ambiguous data or inconsistent codes in order to better shape the definitions and negotiate code boundaries. Aeonium supports this aim in two ways: the ML model predicts tweets for which partners may disagree, and users may explicitly flag ambiguous tweets during coding. Showing tweets that are likely to be ambiguous or inconsistent between coders draws users' attention to these tweets and encourages a dialogue between coders to improve mutual understanding and consistent coding. The ambiguity flag is also useful for exploring sources of confusion or uncertainty in coding. 
From our preliminary interviews with qualitative researchers, we determined that when coders initially disagree on appropriate codes or initially identify multiple mutually exclusive codes that may be applicable, these data require additional attention. For our primary evaluation of Aeonium, the metric we used to predict "ambiguous" data was disagreement between partners on prior coding decisions. Since partners may change their codes through the feedback dropdown menu in the review interface, Aeonium's ML model predicted ambiguity based on tweets for which partners continued to disagree after reviewing. After a pair completes the review stage, the system will train a classifier for each user based on their existing coded tweets after review. Then for the remaining tweets that have not been coded, the system predicts labels for both partners. For a stage of coding focused on ambiguous data, the system will sort uncoded tweets based on level of predicted disagreement (i.e., tweets for which partners are predicted to disagree with the highest confidence), and the dataset for the ambiguous stage will include tweets for which partners are most most strongly predicted to apply inconsistent codes. In order to better surface ambiguity in the future, Aeonium's ML models will soon incorporate as features the explicit flagging of ambiguous tweets and the "Unsure" responses from the disagreement feedback dropdown menu, which indicate uncertainty about the decision.” (page 4, section 4.1)].
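
By way of illustration only, the sorting of uncoded items by level of predicted disagreement described in the quoted passage might be sketched as follows. Drouhard does not disclose a scoring formula; the score used here (the lower of the two classifiers' confidences when their predicted codes differ, and zero otherwise) is an assumption, as are the function name, the input format, and the sample predictions.

```python
def rank_by_predicted_disagreement(pred_a, pred_b):
    """Order uncoded item ids so the most confidently predicted disagreements come first.

    pred_a and pred_b map item id -> (predicted code, confidence in [0, 1]),
    one mapping per partner's classifier (hypothetical input format).
    """
    def score(item):
        code_a, conf_a = pred_a[item]
        code_b, conf_b = pred_b[item]
        # Assumed scoring: zero when the predicted codes agree; otherwise the
        # weaker of the two confidences, so both models must be fairly sure.
        return min(conf_a, conf_b) if code_a != code_b else 0.0

    shared = sorted(pred_a.keys() & pred_b.keys())
    return sorted(shared, key=score, reverse=True)

# Hypothetical per-partner predictions, for illustration only.
pred_a = {"t1": ("Support", 0.9), "t2": ("Support", 0.6), "t3": ("Rejection", 0.8)}
pred_b = {"t1": ("Rejection", 0.8), "t2": ("Support", 0.7), "t3": ("Support", 0.55)}

order = rank_by_predicted_disagreement(pred_a, pred_b)
```

Items at the front of `order` would be the ones presented first in a coding stage focused on ambiguous data.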

As to dependent claims 8 and 18, Drouhard further shows:
said at least one of internal consistency and external inconsistency comprises at least said internal inconsistency [“{…} factors such as mood changes, attention, and memory can impact consistency between different coders, or inter-rater reliability, as well as consistency of the same coder at different points in time. {…}” (page 3, section 3.1)];
and said detecting of said internal consistency comprises periodically retesting said annotator on a portion of previously-labeled data using at least one predefined selection criterion [“Keyword highlighting works in three ways in Aeonium. During coding, if a user-extracted keyword exists in the tweet text, it will be highlighted in the color of its associated code, drawing attention to key information and mitigate the risks of inattentive coding. Secondly, when reviewing coded tweets, a user can add new keywords from a tweet's text to explain the context for choosing a particular code. Finally, after adding a new keyword, users can see how this keyword is distributed over codes. If a keyword is not very predictive or informative (i.e., it is not particularly well aligned with a specific code), users will be able to recognize that it may not be a useful keyword to explain a coding decision. As outlined in describing design objective 2, researchers want to understand context around coding decisions, and highlighting keywords identifies some of the explicit context. In addition, when keywords are shown highlighted according to code color in the coding interface, users can assess how well keywords align with their assigned codes. Since the coding process is iterative as stated in design objective 3, users can reflect on the association between keywords and codes. For example, when a user sees a keyword highlighted in green as "Support" in the text, but he/she thinks the tweet should be coded "Rejection", the user can reevaluate how well the keyword aligns with the "Support" code.
4.4 Code Definitions for Context Awareness and Iterative Negotiation of Code boundaries
Aeonium's coding interface shows codes in a given codebook, along with their definitions and examples just below the tweet. By placing these definitions centrally within interface, users always have at least peripheral awareness of the definitions. This functionality serves design objective 2. Researchers in our interviews indicated that codes tend have vague boundaries, so centralizing code definitions can help coders make more efficient, better informed coding decisions. Additionally, since each user likely has a slightly different understanding of code meanings, explicit support for negotiation can improve consistency and facilitate iteration (design objective 3). The review interface in Aeonium lets users provide expanded definitions based on how code meanings are evolving, and this information is displayed to collaborators in both the coding and review interfaces, in which users see their partners' definitions and their own extended definitions in addition to the master coder definitions.” (page 5, sections 4.3 and 4.4)].

As to dependent claims 9 and 19, Drouhard further shows:
said at least one of internal consistency and external inconsistency comprises at least said external inconsistency [“{…} factors such as mood changes, attention, and memory can impact consistency between different coders, or inter-rater reliability, as well as consistency of the same coder at different points in time. {…}” (page 3, section 3.1)];
and said detecting of said external consistency comprises measuring inter-annotator consistency in real time [“Inter-rater reliability (IRR) is a widely used measure for evaluating consistency between coders. There are various ways to calculate IRR, and one of the most common methods is Cohen's Kappa [12]. {…} In our study, we use IRR (specifically Cohen's Kappa) as a metric for agreement or consistent application of codes between two coders in order to measure improvement in consistency under various conditions.” (page 2, section 2.5)].
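
By way of illustration only, Cohen's Kappa for two coders is conventionally computed as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each coder's code frequencies. A minimal sketch follows; the function name and the sample codes are hypothetical, and nothing here is represented to be Drouhard's implementation.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same code by both coders.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal frequencies, summed over codes.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders applied one identical code to every item
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes from two coders over six items, for illustration only.
coder_a = ["Support", "Support", "Rejection", "Uncodable", "Support", "Rejection"]
coder_b = ["Support", "Rejection", "Rejection", "Uncodable", "Support", "Support"]

kappa = cohens_kappa(coder_a, coder_b)
```

In this sample the coders agree on four of six items (p_o = 24/36) against a chance agreement of p_e = 14/36, giving a kappa of 10/22, or roughly 0.45.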

As to dependent claims 10 and 20, Drouhard further shows:
further comprising embedding at least one affordance in said graphical user interface, the at least one affordance being an aid to said annotator in labeling said unlabeled corpus data to be labeled [“{…} Aeonium uses colors to facilitate coding visually. However, Aeonium's design also explicitly facilitates evolving code definitions and includes a specialized interface for code review in order to better support iteration in coding. Some other tools offer computational tools for inter-coder reliability to identify inconsistency, but Aeonium presents visual overviews for classes of disagreement. It also includes mechanisms for feedback, and its ML model can identify uncoded data for which coders are likely to disagree.” (page 2, section 2.2) 
“{…} Aeonium has two interfaces: one for coding data, shown in Figure 1, and another for reviewing codes, keywords, and definitions, shown in Figure 2. The coding interface supports efficient coding decisions by locating the color-coded definitions centrally on the screen and by showing keywords (if present within a given tweet) highlighted in the color used to represent the associated code. The coding interface also affords more in-depth analysis by showing extended definitions and visual overviews of keywords for a selected code in the lower panel. The review interface facilitates negotiation of assigned codes between a pair of coders, as well as reinterpretation of data given evolving code definitions and new insights from data.” (page 4, section 4)].

As to independent claim 11, Drouhard shows a method for improving performance of a computer implementing a machine learning system [e.g. the “Aeonium System” (page 4, section 4)], said method comprising:
providing, via a graphical user interface, to an annotator, unlabeled corpus data to be labeled; embedding, concurrent with corpus data labeling, at least one affordance in said graphical user interface, the at least one affordance being an aid to said annotator in labeling said unlabeled corpus data to be labeled [“{…} Aeonium has two interfaces: one for coding data, shown in Figure 1, and another for reviewing codes, keywords, and definitions, shown in Figure 2. The coding interface supports efficient coding decisions by locating the color-coded definitions centrally on the screen and by showing keywords (if present within a given tweet) highlighted in the color used to represent the associated code. The coding interface also affords more in-depth analysis by showing extended definitions and visual overviews of keywords for a selected code in the lower panel. The review interface facilitates negotiation of assigned codes between a pair of coders, as well as reinterpretation of data given evolving code definitions and new insights from data.” (page 4, section 4)
For even further context/examples of the “concurrent with said labeling” aspects, see also the plurality of “intervention” alternatives illustrated in figs. 1 and 2 that are provided “concurrently with said labeling.” See also how “{…} Aeonium lets users provide expanded definitions based on how code meanings are evolving, and this information is displayed to collaborators in both the coding and review interfaces, in which users see their partners' definitions and their own extended definitions in addition to the master coder definitions.” (page 5, section 4.4)];
obtaining, via said graphical user interface, labels for said unlabeled corpus data to be labeled, said labels being provided while said annotator is using said at least one affordance [e.g. obtaining labels/“codes” for said unlabeled “tweets”/corpus data to be labeled while using the affordances illustrated in fig. 1];
carrying out training of said machine learning system to provide a trained machine learning system based on said labels provided while said annotator is using said at least one affordance; and carrying out classifying new data with said trained machine learning system [“Aeonium supports qualitative focused coding, facilitating the review of coded data between coding partners. {…} Aeonium may be used for qualitative coding of any short text documents, but we have conducted our initial studies with a tweet dataset. 
Aeonium trains a support vector machine (SVM) classifier for each user based on their labels (i.e. coded tweets) and features (or keywords). Keyword features consist of system keywords (i.e. bag-of-words unigram features) extracted from the data set and user-defined keywords (i.e. one or more selected words that are not necessarily contiguous) extracted from coded tweets that are relevant to explicitly explaining the code decision. The feature value is computed by matching the words in the tweet to the keywords. The classifiers are used only for the purpose of suggesting tweets to label. Aeonium has two interfaces: one for coding data, shown in Figure 1, and another for reviewing codes, keywords, and definitions, shown in Figure 2. The coding interface supports efficient coding decisions by locating the color-coded definitions centrally on the screen and by showing keywords (if present within a given tweet) highlighted in the color used to represent the associated code. The coding interface also affords more in-depth analysis by showing extended definitions and visual overviews of keywords for a selected code in the lower panel. The review interface facilitates negotiation of assigned codes between a pair of coders, as well as reinterpretation of data given evolving code definitions and new insights from data.” (page 4, section 4) | For further context/examples, see also page 1, section 2.1 and page 2, section 2.2.].

As to dependent claim 12, Drouhard further shows:
wherein, in said embedding step, said at least one affordance comprises a definition and example for each label [“Fig. 1. Aeonium's coding interface consists of two panels: The top panel primarily supports the coding task and displays a single tweet to code (1) with keywords that the model recognizes highlighted (2), buttons to flag that tweet as ambiguous, save it, or make it an exemplar tweet for the selected code (3), and a row of color-coded buttons representing codes in the coding schema with the code definition and an exemplar tweet from master coders (4). The bottom panel enhances the coding task by providing additional information such as the code definitions and examples of the master, user, and partner (5) to highlight descrepancies, distribution of coded tweets for the system, user, and partner keywords (6) to illustrate keyword relevance, and the user's previously coded tweets (7) for history.” (page 5; fig. 1)].

As to dependent claim 13, Drouhard further shows:
wherein, in said embedding step, said at least one affordance comprises a wizard routine which queries said annotator with a series of questions, via said graphical user interface, the at least one affordance being an aid to said annotator in label selection [“In addition to drawing out ambiguity, Aeonium facilitates reviewing and resolving disagreements. In the review interface, the code comparison table summarizes how users and their partners agree or disagree with each other. For instance, in Figure 2.7, the interface shows that among the tweets coded with the "Support" label by the current user, his/her partner has disagreed twice, with one tweet coded as "Rejection" and another as "Uncodable." By clicking on each row, users can filter to tweets that belong to the selected code combination and focus on analyzing, providing feedback, and resolving inconsistency. Currently, for each tweet on which partners' codes disagreed, Aeonium provides three response options for the disagreement: "My code is correct", "My partner's code is correct", or "Unsure." "Unsure" may indicate that either code might apply or that there is insufficient context to make a coding judgment. When a user indicates that his/her partner's original code is correct, the system will ask that user whether or not to change the assigned code to match the partner's code. This feedback loop can help coders become more consistent with each other over time, and it implicitly supports the iterative nature of qualitative coding (design objective 3). Additionally, since disagreement can be an indication of ambiguity, having pairwise comparison also contributes to our design objective 1 to draw out ambiguous data.” (page 4, section 4.2) 
“{…} All participants completed a survey in which they labeled 20 tweets. Participants were provided the definitions for the codes given in Section 5.3, and they were required to provide a code for each tweet without the option to label tweets as ambiguous. After labeling the tweets in the survey, participants were asked to rate their confidence in the codes they had applied to the tweets on a scale of 1 to 5.” (page 7, section 5.4.2) | For even further context/examples, see also: pages 3-4, section 3.3.].

As to dependent claim 14, Drouhard further shows:
wherein, in said embedding step, said at least one affordance comprises a re-ordering, by difficulty in the labeling, of said unlabeled corpus data to be labeled [“As described in design objective 1, qualitative researchers are interested in ways to draw out ambiguous data or inconsistent codes in order to better shape the definitions and negotiate code boundaries. Aeonium supports this aim in two ways: the ML model predicts tweets for which partners may disagree, and users may explicitly flag ambiguous tweets during coding. Showing tweets that are likely to be ambiguous or inconsistent between coders draws users' attention to these tweets and encourages a dialogue between coders to improve mutual understanding and consistent coding. The ambiguity flag is also useful for exploring sources of confusion or uncertainty in coding. 
From our preliminary interviews with qualitative researchers, we determined that when coders initially disagree on appropriate codes or initially identify multiple mutually exclusive codes that may be applicable, these data require additional attention. For our primary evaluation of Aeonium, the metric we used to predict "ambiguous" data was disagreement between partners on prior coding decisions. Since partners may change their codes through the feedback dropdown menu in the review interface, Aeonium's ML model predicted ambiguity based on tweets for which partners continued to disagree after reviewing. After a pair completes the review stage, the system will train a classifier for each user based on their existing coded tweets after review. Then for the remaining tweets that have not been coded, the system predicts labels for both partners. For a stage of coding focused on ambiguous data, the system will sort uncoded tweets based on level of predicted disagreement (i.e., tweets for which partners are predicted to disagree with the highest confidence), and the dataset for the ambiguous stage will include tweets for which partners are most most strongly predicted to apply inconsistent codes. In order to better surface ambiguity in the future, Aeonium's ML models will soon incorporate as features the explicit flagging of ambiguous tweets and the "Unsure" responses from the disagreement feedback dropdown menu, which indicate uncertainty about the decision.” (page 4, section 4.1)].

As to dependent claim 15, Drouhard further shows:
wherein, in said embedding step, said at least one affordance comprises providing data related to how at least one additional annotator has carried out labeling [“Fig. 1. Aeonium's coding interface consists of two panels: The top panel primarily supports the coding task and displays a single tweet to code (1) with keywords that the model recognizes highlighted (2), buttons to flag that tweet as ambiguous, save it, or make it an exemplar tweet for the selected code (3), and a row of color-coded buttons representing codes in the coding schema with the code definition and an exemplar tweet from master coders (4). The bottom panel enhances the coding task by providing additional information such as the code definitions and examples of the master, user, and partner (5) to highlight descrepancies, distribution of coded tweets for the system, user, and partner keywords (6) to illustrate keyword relevance, and the user's previously coded tweets (7) for history.” (page 5; fig. 1)].

Response to Arguments
Applicant’s arguments have been fully considered but they are not persuasive. Applicant argues:
“    Drouhard discusses that participants were explicitly instructed "not to worry about accuracy” in coding, but rather to simply choose the code they thought most appropriate, and that each stage is composed of two sub-stages: first coding and then review. Applicant maintains that such a teaching teaches away from "intervening in said labeling with a reactive intervention subsystem until said at least one of internal inconsistency in said labeling and external inconsistency in said labeling is addressed"; that is, there is no intervening in the labeling process of Drouhard. Moreover, such an intervention would not be needed given that Drouhard teaches to essentially ignore accuracy. Drouhard does not disclose or suggest "detecting, with a consistency calculation routine, concurrent with said labeling, a plurality of inconsistencies comprising at least one of internal inconsistency in said labeling and external inconsistency in said labeling; responsive to said detection of said at least one of internal inconsistency and external inconsistency, intervening in said labeling with a reactive intervention subsystem until said at least one of internal inconsistency in said labeling and external inconsistency in said labeling is addressed."”

The Office respectfully disagrees. First, it is noted that the backstory upon which Applicant relies (namely, the instructions that were given to participants of a study) was never actually relied upon or mapped to by the Office. Moreover, said backstory “does not constitute a teaching away from any of these alternatives because such disclosure does not criticize, discredit, or otherwise discourage the solution claimed.” 1 Instead, the actual graphical user interface functionalities of the “Aeonium” program, which explicitly show an operability to “intervene in said labeling, concurrent with said labeling,” were mapped to the claimed limitations. For example, see the plurality of “intervention” alternatives illustrated in figs. 1 and 2 that are provided “concurrently with said labeling.” See also how “{…} Aeonium lets users provide expanded definitions based on how code meanings are evolving, and this information is displayed to collaborators in both the coding and review interfaces, in which users see their partners' definitions and their own extended definitions in addition to the master coder definitions.” (page 5, section 4.4) (for even further context/examples, see also: page 3, section 3.1, 3rd paragraph; page 4, section 4.2; and the plurality of updating/refining intervention scenarios executed concurrently with labeling in page 7, section 5.2).

“    Drouhard discusses sorting "uncoded tweets based on level of predicted disagreement (i.e., tweets for which partners are predicted to disagree with the highest confidence)." Drouhard does not, however, disclose or suggest that intervening that comprises re-ordering, by difficulty in the labeling, said unlabeled corpus data to be labeled.
Thus, Drouhard does not disclose or suggest "said intervening comprises re-ordering, by difficulty in the labeling, said unlabeled corpus data to be labeled," as variously recited by Claims 7 and 14, as amended.”

The Office respectfully disagrees. First, as noted above, the “difficulty” limitation to which Applicant refers herein remains indefinite, which affects how the claims are interpreted for purposes of prior art analysis, since the metes and bounds of “difficulty” cannot be objectively ascertained as currently drafted. Moreover, the Office respectfully maintains that the cited functionalities (namely, the operability to sort by disagreement, which directly corresponds to levels of difficulty/uncertainty/ambiguity) reasonably read on the features to which Applicant refers.

“    Drouhard discusses that Inter-rater reliability (IRR) is a widely used measure for evaluating consistency between coders. Drouhard does not, however, disclose or suggest that a detecting of external consistency comprises measuring inter-annotator consistency in real time.
Thus, Drouhard does not disclose or suggest "said detecting of said external consistency comprises measuring inter-annotator consistency in real time," as recited by Claim 9, as amended.”

The Office respectfully disagrees. Drouhard explicitly mentions that its IRR aspect is used as a “measure for evaluating consistency between coders” and states that “we use IRR (specifically Cohen's Kappa) as a metric for agreement or consistent application of codes between two coders in order to measure improvement in consistency under various conditions.” (page 2, section 2.5). Furthermore, Drouhard even mentions that “Showing tweets {in real time} that are likely to be ambiguous or inconsistent between coders draws users' attention to these tweets and encourages a dialogue between coders to improve mutual understanding and consistent coding” (page 4, section 4.1). Also, “When a user indicates {in real time} that his/her partner's original code is correct, the system will ask {in real time} that user whether or not to change the assigned code to match the partner's code. This {real time} feedback loop can help coders become more consistent with each other over time, and it implicitly supports the iterative nature of qualitative coding (design objective 3)” (page 4, section 4.2). Lastly, at page 9, section 6.1, Drouhard delves into even further examples of “measuring inter-annotator consistency in real time.”

Therefore, the Office respectfully asserts that the cited art sufficiently teaches the limitations recited in the amended claims.

Conclusion
THIS ACTION IS MADE FINAL.  Applicants are reminded of the extension of time policy as set forth in 37 C.F.R. § 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 C.F.R. § 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure.  Applicants are required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way.  A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALVARO R CALDERON IV whose telephone number is (571)272-1818.  The examiner can normally be reached on Monday - Friday (9:30am - 6:00pm).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kieu D. Vu, can be reached at (571) 272-4057.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/ALVARO R. CALDERON IV/
Examiner
Art Unit 2173




/KIEU D VU/
Supervisory Patent Examiner, Art Unit 2173

1 In re Fulton, 391 F.3d 1195, 1201, 73 USPQ2d 1141, 1146 (Fed. Cir. 2004).