Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 2019-07-17 is being considered by the examiner.
Specification
The disclosure is objected to because of the following informalities: Specification Para [0061] recites:  “In this case, sinS are the seed tokens of the annotation categories, and the Dist(seeds, tweet) function represents the sth element of the vector”.    “sinS” has no antecedent basis, but Examiner suspects this is intended to read “seeds”.  Appropriate correction is required.
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. The following title is suggested: “Multi-View Classifier Training with Weighted Consensus Labels”.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claim(s) does/do not fall within at least one of the four not limited to” these non-transitory devices, the claim may include non-statutory subject matter such as signals and transmissions.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 5, 8, 11, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Blum et al. (“Combining Labeled and Unlabeled Data with Co-Training”; hereinafter Blum) in view of Cordeiro Junior et al. (“A PSO algorithm for Improving Multi-View Classification”; hereinafter Cordeiro).
As per Claim 1, Blum teaches a method of training a classifier (Blum, Section 6, discloses:  “In order to test the idea of co-training, we applied it to the problem of learning to classify web pages”.  Here, Blum discloses training (“co-training”) a classifier (“classify web pages”)).
receiving labeled input data and unlabeled input data (Blum, Table 1, discloses:  

[Image: reproduction of Blum, Table 1]

Here, Blum discloses receiving (“given”) labeled input data (“a set L of labeled training examples”) and unlabeled input data (“a set U of unlabeled examples”)).
	extracting, from the labeled input data, a first set of features belonging to a first feature space (Blum, Abstract, discloses:  “In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the task of learning to classify web pages. For example, the description of a web page can be partitioned into the words occurring on that page, and the words occurring in hyperlinks that point to that page.”  Here, Blum discloses a first feature space (“words occurring on that page”, which is one of “two distinct views”).  Blum, Abstract, continues:  “We assume that either view of the example would be sufficient for learning if we had enough labeled data, but our goal is to use both views together to allow inexpensive unlabeled data to augment a much smaller set of labeled examples. Specifically, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each algorithm's predictions on new unlabeled examples are used to enlarge the training set of the other.” Here, Blum discloses training based on the data that is belonging to a first feature space (“view”), and in order to use the data for training, the data must be extracted for use.  Thus, Blum discloses extracting a first set of features belonging to a first feature space.  Blum also discloses labeled input data (“a smaller set of labeled examples”)).  Therefore, Blum discloses extracting, from the labeled input data, a first set of features belonging to a first feature space.)
	extracting, from the labeled input data, a second set of features belonging to a second feature space different from the first feature space  (Blum, Abstract, discloses:  “In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the task of learning to classify web pages. For example, the description of a web page can be partitioned into the words occurring on that page, and the words occurring in hyperlinks that point to that page.”  Here, Blum discloses a second feature space (“words occurring in hyperlinks that point to that page”, which is one of “two distinct views”).  Blum, Abstract, continues:  “We assume that either view of the example would be sufficient for learning if we had enough labeled data, but our goal is to use both views together to allow inexpensive unlabeled data to augment a much smaller set of labeled examples. Specifically, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each algorithm's predictions on new unlabeled examples are used to enlarge the training set of the other.” Here, Blum discloses training based on the data that is belonging to a second feature space (“view”), and in order to use the data for training, the data must be extracted for use.  Thus, Blum discloses extracting a second set of features belonging to a second feature space.  Blum also discloses labeled input data (“a smaller set of labeled examples”)).  Therefore, Blum discloses extracting, from the labeled input data, a second set of features belonging to a second feature space.  Blum also discloses “two distinct views”, and thus discloses a second feature space different from the first feature space.)
	training a first classifier using the first feature set and applying the trained first classifier to the unlabeled input data to predict a first label (Blum, Table 1, discloses:

[Image: reproduction of Blum, Table 1]

Here, Blum discloses training a first classifier using the first feature set (“Use L to train a classifier h1 that considers only the x1 portion of x”, wherein L is “labeled” training examples of x1 and thus the first feature set).  Blum also discloses applying the trained first classifier to the unlabeled input data to predict a first label, as they disclose “Allow h1 to label p positive and n negative examples from U’”, wherein h1 is the first classifier and U’ is unlabeled input since it is chosen from U, which is “a set U of unlabeled examples”.    This is used to predict a first label (“label p positive and n negative examples”)).
	training a second classifier using the second feature set and applying the trained second classifier to the unlabeled input data to predict a second label (Blum, Table 1, discloses:

[Image: reproduction of Blum, Table 1]

Here, Blum discloses training a second classifier using the second feature set (“Use L to train a classifier h2 that considers only the x2 portion of x”, wherein L is “labeled” training examples of x2 and thus the second feature set).  Blum also discloses applying the trained second classifier to the unlabeled input data to predict a second label, as they disclose “Allow h2 to label p positive and n negative examples from U’”, wherein h2 is the second classifier and U’ is unlabeled input data since it is chosen from U, which is “a set U of unlabeled examples”.    This is used to predict a second label (“label p positive and n negative examples”)).
	expanding the labeled input data with supplementary unlabeled data and its [consensus] label (Blum, Table 1, discloses:

[Image: reproduction of Blum, Table 1]

Here, Blum discloses “Add these self-labeled examples to L”, wherein supplementary unlabeled data with its predicted label (“self-labeled examples”, originally from unlabeled set U) are used for expanding the labeled input data (“Add…to L”, wherein L is the set of labeled data)). *Consensus label taught by Cordeiro below.
	retraining at least one of the first classifier and the second classifier based on a training example comprising the expanded labeled input data and the [consensus] label (Blum, Table 1, discloses:

[Image: reproduction of Blum, Table 1]

Here, Blum discloses expanded labeled input data with its predicted label (“Add these self-labeled examples to L”).  Blum also discloses “Loop for k iterations”.  If one goes back to the beginning of the loop, it returns to “Use L to train a classifier h1” and “Use L to train a classifier h2”. Therefore, Blum discloses retraining at least one of the first classifier and the second classifier based on a training example comprising the expanded labeled input data.) *Consensus label taught by Cordeiro below.
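For illustration only, the co-training loop that the Examiner reads onto Blum, Table 1 can be sketched as follows. This is a simplified sketch under stated assumptions, not Blum's exact procedure: the `CentroidClassifier` class, the `co_train` function, the `per_round` parameter, and the toy data are all hypothetical, and the sketch omits Blum's smaller pool U’ and the separate counts of p positive and n negative examples.

```python
import numpy as np

class CentroidClassifier:
    """Hypothetical stand-in classifier: predicts the class with the nearest mean vector."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def scores(self, X):
        # Negative Euclidean distance to each class centroid; higher is more confident.
        return -np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)

    def predict(self, X):
        return self.classes_[np.argmax(self.scores(X), axis=1)]

def co_train(X1, X2, y, U1, U2, k=5, per_round=2):
    """X1/X2 are two views of the labeled set L; U1/U2 are the same views of U."""
    X1, X2, y = X1.copy(), X2.copy(), y.copy()
    remaining = np.arange(len(U1))          # indices of still-unlabeled examples
    for _ in range(k):
        h1 = CentroidClassifier().fit(X1, y)   # trained on view 1 only
        h2 = CentroidClassifier().fit(X2, y)   # trained on view 2 only
        if len(remaining) == 0:
            break
        picked = {}
        for h, U in ((h1, U1), (h2, U2)):
            s = h.scores(U[remaining])
            conf = s.max(axis=1) - s.min(axis=1)   # gap between best and worst class score
            top = np.argsort(-conf)[:per_round]    # most confident unlabeled examples
            labels = h.classes_[np.argmax(s[top], axis=1)]
            for i, lab in zip(remaining[top], labels):
                picked.setdefault(int(i), lab)     # first classifier to pick an example labels it
        idx = np.array(sorted(picked))
        # Add the self-labeled examples to L, in both views.
        X1 = np.vstack([X1, U1[idx]])
        X2 = np.vstack([X2, U2[idx]])
        y = np.concatenate([y, [picked[int(i)] for i in idx]])
        remaining = np.setdiff1d(remaining, idx)
    # Retrain both classifiers on the expanded labeled set.
    return CentroidClassifier().fit(X1, y), CentroidClassifier().fit(X2, y), y

# Toy data: two labeled and four unlabeled examples, each with two views.
X1 = np.array([[0.0, 0.0], [10.0, 10.0]])
X2 = np.array([[0.0], [5.0]])
y = np.array([0, 1])
U1 = np.array([[0.5, 0.2], [9.5, 9.9], [0.1, 0.3], [10.2, 9.7]])
U2 = np.array([[0.2], [4.8], [0.4], [5.1]])
h1, h2, y_expanded = co_train(X1, X2, y, U1, U2, k=3, per_round=2)
```

Each pass through the loop retrains both classifiers on the labeled set as enlarged by the previous pass, which is the behavior the Examiner relies on for the retraining limitation.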
	However, Blum does not teach extracting, from the labeled input data, a third set of features belonging to a third feature space different from the first feature space and the second feature space; training a third classifier using the third feature set and applying the trained third classifier to the unlabeled input data to predict a third label; identifying a consensus label for the unlabeled input data based on the first label, the second label, and the third label; expanding the labeled input data with supplementary unlabeled data and its consensus label; retraining at least one of the first classifier and the second classifier based on a training example comprising the expanded labeled input data and the consensus label.
Cordeiro teaches extracting[, from the labeled input data,] a third set of features belonging to a third feature space different from the first feature space and the second feature space (Recall that above Blum discloses labeled input data.  Cordeiro, Fig. 1, discloses 

[Image: reproduction of Cordeiro, Fig. 1]

Here, Cordeiro discloses three separate “views” (feature spaces), and specifically a third feature space (“Image”) which is different from the first feature space and the second feature space (“Audio” and “Subtitles”))
[training a] third classifier using the third feature set and [applying the trained third classifier to the unlabeled input data to predict a third label ] (Recall that above Blum discloses training a classifier and applying a classifier to unlabeled input data to predict a label.  Cordeiro, Fig. 1, discloses 

[Image: reproduction of Cordeiro, Fig. 1]

Here, Cordeiro discloses a third classifier (“Classifier 3”) using the third feature set (“Image”))
identifying a consensus label for the [unlabeled] input data based on the first label, the second label, and the third label (Recall that Blum above discloses unlabeled input data.  Cordeiro, Fig. 1, discloses 

[Image: reproduction of Cordeiro, Fig. 1]

Here, Cordeiro discloses identifying a consensus label (“Class”) for the input data (“Video”) based on the first label, the second label, and the third label (Phi1, Phi2, Phi3 wherein each Phi is the output of one of the first, second, and third Classifiers, and the output of a Classifier is a label)).
Blum and Cordeiro are analogous art because they are both in the field of endeavor of machine learning.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the co-training of Blum, with the weighted consensus label of Cordeiro. The modification would have been obvious because one of ordinary skill in the art would be motivated to improve the accuracy of a classifier (Cordeiro, Abstract: “This paper proposes a PSO algorithm to combine the outputs coming from different views. It also considers that some views may be better at classifying specific classes, and provides weighting schemes for both views and classes. Experiments were performed in two datasets with three views each, and 

As per Claim 4, the combination of Blum and Cordeiro teaches the method of claim 1.  Cordeiro teaches wherein identifying the consensus label comprises: weighting each of the first label, second label, and third label according to respective weights associated with the first, second, and third classifier to produce weighted votes for each unique label (Cordeiro, Fig. 1, discloses 

[Image: reproduction of Cordeiro, Fig. 1]

Here, Cordeiro discloses identifying the consensus label (“Class”), as well as first label, second label, and third label (Phi1, Phi2, Phi3) and first, second, and third classifier (Classifier1, 2, and 3).  Cordeiro, Section 3 Para 4, discloses:  “Following this basic principle, here we propose a PSO algorithm for weighting classifications coming from different data views, from now on referred as PSO-WV. In a second step, the PSO is modified to also provide weights for classes in specific views, by estimating how good a view is to predict a class, generating PSO-WC. The next sections describe the three main components of the PSO modified to implement PSO-WV and PSO-WC: (i) particle representation, (ii) fitness function, and (iii) the mechanisms for velocity and position update. Finally, we show how the final vector of weights is used to determine the class of a new test example. Algorithm 23 underlines the overall procedure.”  Here, Cordeiro discloses weighting each label (“weighting classifications coming from different data views”).  Cordeiro, Section 3 A Last Paragraph, discloses:  “The role of the weights Wi and Wij is to modify the confidence of the classifiers in predicting the class of an instance e according to this profile. By confidence we mean any metric that assesses the “quality” of a prediction. Naïve Bayes, for example, returns a list of probabilities of the new example to belong to each class in the database. When weighting views, the weight Wi will modify the probability value of the winner class (the biggest returned probability will correspond to -), while Wij will weight, for each view, each probability value in the returned probability list. More details are given in Section III-D.”  Here, Cordeiro discloses weights associated with each classifier (“The role of the weights Wi and Wij is to modify the confidence of the classifiers”).  
Cordeiro, Section 2 B Last Paragraph, discloses:  “In the context of weighting classifiers, [23] presented a PSO to compute weights for combining multiple neural network classifiers. The weights were obtained so that they minimize the total classification error rate of the ensemble system. In [15] we found the most similar work to ours. [15] modeled a PSO to weight the majority vote (WMV) of an ensemble of classifiers. Apart from working with MVL instead of ensembles, the strategy adopted here differs from ours in two main points: (i) [15] weights votes, we weight the classifier outputs before performing the voting process, (ii) the weights generated by the PSO can also take into account the classes in each view, while [15] weights only the outputs of the predicted class. Hence, the PSO can enforce or discourage a prediction if that classifier is not the best in that view.”  Here, Cordeiro discloses weighting each label according to respective weights associated with each classifier to produce weighted votes for each unique label, as Cordeiro discloses “we weight the classifier outputs before performing the voting process”.  Here, the “classifier outputs” are each label, and they are weighted according to weights associated with each classifier (“weight the classifier outputs”), and after the classifiers are weighted, voting is performed, thus they produce weighted votes for each unique label.  
	selecting the unique label having a highest weighted vote (Cordeiro, Section 3 A Last Paragraph, discloses:  “The role of the weights Wi and Wij is to modify the confidence of the classifiers in predicting the class of an instance e according to this profile. By confidence we mean any metric that assesses the “quality” of a prediction. Naïve Bayes, for example, returns a list of probabilities of the new example to belong to each class in the database. When weighting views, the weight Wi will modify the probability value of the winner class (the biggest returned probability will correspond to -), while Wij will weight, for each view, each probability value in the returned probability list. More details are given in Section III-D.”  Here, Cordeiro discloses selecting the unique label (“winner class”) having a highest weighted vote (“biggest returned probability”)).
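As an illustration of the claimed weighted-vote step discussed above, the following is a minimal sketch that assumes hard labels and fixed per-classifier weights. The function name `weighted_consensus`, the example labels, and the weight values are hypothetical. Cordeiro instead derives its weights with a PSO and applies them to classifier output probabilities before voting, which this sketch does not reproduce.

```python
from collections import defaultdict

def weighted_consensus(labels, weights):
    """Return the label with the highest total weight, plus the per-label tallies."""
    if len(labels) != len(weights):
        raise ValueError("one weight per classifier is required")
    tally = defaultdict(float)
    for label, w in zip(labels, weights):
        tally[label] += w              # each classifier casts a vote scaled by its weight
    # On a tie, max() returns the label that was tallied first.
    winner = max(tally, key=tally.get)
    return winner, dict(tally)

# Three classifiers (one per view) predict a label for the same unlabeled example.
label, votes = weighted_consensus(["sports", "politics", "politics"], [0.9, 0.3, 0.4])
unweighted, _ = weighted_consensus(["sports", "politics", "politics"], [1.0, 1.0, 1.0])
```

With these hypothetical weights the single high-weight classifier outvotes the two low-weight classifiers, whereas equal weights reduce to a simple majority vote.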
	
Claims 5 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Blum and Cordeiro in view of Tang et al. (“Co-Tracking Using Semi-Supervised Support Vector Machines”; hereinafter Tang).
As per Claim 5, the combination of Blum and Cordeiro teaches the method of claim 4.  Cordeiro teaches further comprising generating weights for each of the first, second, and third classifier (Cordeiro, Fig. 1, discloses 

[Image: reproduction of Cordeiro, Fig. 1]

Here, Cordeiro discloses first, second, and third classifier (Classifier1, 2, and 3).  Cordeiro, Section 3 A Last Paragraph, discloses:  “The role of the weights Wi and Wij is to modify the confidence of the classifiers in predicting the class of an instance e according to this profile. By confidence we mean any metric that assesses the “quality” of a prediction. Naïve Bayes, for example, returns a list of probabilities of the new example to belong to each class in the database. When weighting views, the weight Wi will modify the probability value of the winner class (the biggest returned probability will correspond to -), while Wij will weight, for each view, each probability value in the returned probability list. More details are given in Section III-D.”  Here, Cordeiro discloses weights associated with each classifier (“The role of the weights Wi and Wij is to modify the confidence of the classifiers”).  
	However, Cordeiro does not explicitly teach generating the classifier weights based on respective performances of the classifiers against an annotated dataset.  
	Tang teaches generating the classifier weights based on respective performances of the classifiers against an annotated dataset (Tang, Section 3.1.2, discloses:  “In order to combine trained classifiers into a final classifier we must assign a weight to each of them. Logically, this weight should be based on the accuracy of each classifier. We therefore adapt the concept from AdaBoost [11] of determining the weight of a classifier based on its error on a labeled validation set.”  Here, Tang discloses generating the classifier weights (“combine trained classifiers…assign a weight to each of them”) based on respective performances of the classifiers against an annotated dataset (“based on its error on a labeled validation set”)).
Blum, Cordeiro, and Tang are analogous art because they are all in the field of endeavor of machine learning.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the co-training with consensus label of Blum and Cordeiro, with the classifier weighting on labeled data of Tang. The modification would have been obvious because one of ordinary skill in the art would be motivated to minimize errors (Tang, Section 3.1.2: “determining the weight of a classifier based on its error on a labeled validation set”).
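For illustration, a classifier weight derived from error on a labeled validation set can be sketched as follows. The passage quoted from Tang does not give a formula, so this sketch assumes the standard AdaBoost weight, 0.5 * ln((1 - err) / err). The function name `classifier_weight`, the `eps` clamp, and the sample data are hypothetical.

```python
import math

def classifier_weight(predictions, truth, eps=1e-10):
    """AdaBoost-style weight computed from error on a labeled validation set."""
    if len(predictions) != len(truth) or not truth:
        raise ValueError("predictions and truth must be non-empty and equal in length")
    errors = sum(p != t for p, t in zip(predictions, truth))
    err = errors / len(truth)
    err = min(max(err, eps), 1 - eps)   # clamp so the logarithm stays finite
    return 0.5 * math.log((1 - err) / err)

# Validation labels and the predictions of two hypothetical classifiers.
truth = [1, 0, 1, 1]
w_good = classifier_weight([1, 0, 1, 0], truth)     # one error in four
w_chance = classifier_weight([1, 1, 0, 1], truth)   # two errors in four
```

Under this assumed formula a classifier at 50% error receives zero weight, and the weight grows as validation error falls.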

As per Claim 8, Claim 8 is a system claim corresponding to method Claim 1.  The difference is that it recites an interface for receiving data, a memory, a feature extraction module, and a prediction consensus generation module.  (Cordeiro, Section 4, discloses:  “This section presents the results obtained by PSO-WV and PSO-WC in two datasets from different application domains: the ACM-DL1 (Digital Library of the Association for Computing Machinery), which deals with document classification, and an Online Video Social Network database [2], where each instance represents a YouTube user.”  Here, Cordeiro discloses using data from databases, which must be done via a computer interface for receiving data, and the computer must also contain a memory.  Cordeiro, Fig 1, also discloses:

[Image: reproduction of Cordeiro, Fig. 1])
Claim 8 is rejected for the same reasons as Claim 1.

As per Claim 11, Claim 11 is a system claim corresponding to method Claim 4.   Claim 11 is rejected for the same reasons as Claim 4.

As per Claim 12, Claim 12 is a system claim corresponding to method Claim 5.   Claim 12 is rejected for the same reasons as Claim 5.

As per Claim 15, Claim 15 is a computer-readable medium claim corresponding to method Claim 1.  The difference is that it recites a computer-readable medium. (Cordeiro, Section 4, discloses:  “This section presents the results obtained by PSO-WV and PSO-WC in two datasets from different application domains: the ACM-DL1 (Digital Library of the Association for Computing Machinery), which deals with document classification, and an Online Video Social Network database [2], where each instance represents a YouTube user.”  Here, Cordeiro discloses using data from databases, and thus reading data with a computer-readable medium.)  Claim 15 is rejected for the same reasons as Claim 1.

Claims 6, 7, 13, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Blum and Cordeiro in view of Mihalcea.
As per Claim 6, the combination of Blum and Cordeiro teaches the method of Claim 1.  Cordeiro teaches a third set of features (Cordeiro, Fig. 1, discloses three “views”).
However, Cordeiro does not explicitly teach features selected from the group consisting of lexical features, semantic features, and distribution-based features.
Mihalcea discloses features selected from the group consisting of lexical features, semantic features, and distribution-based features (Mihalcea, Section 3.4 discloses: “Several basic word sense disambiguation classifiers can be implemented using feature combinations from Table 1, and feature vectors can be plugged into any learning algorithm. We use Naive Bayes, since it was previously shown that in combination with the features we consider, can lead to a state-of-the-art disambiguation system (Lee and Ng, 2002). Moreover, Naive Bayes is particularly suitable for co-training and self-training, since it provides confidence scores and is efficient in terms of training and testing time. The two separate views required for co-training are defined using a local versus topical feature split. For self-training, a global classifier with no feature split is defined. 
A local classifier 
A local classifier was implemented using all local features listed in Table 1.
A topical classifier
The topical classifier relies on features extracted from a large context, in particular keywords specific to each individual sense. We use the SK feature, and extract at most ten keywords for each word sense, each occurring for at least three times in the annotated corpus. 
A global classifier
Finally, the global classifier integrates all local and topical features, also in a Naive Bayes classifier. This classifier is basically a combination of the previous two local and topical classifiers.”
Here, Mihalcea discloses features selected from the group consisting of lexical features, semantic features, and distribution-based features, as Mihalcea discloses “The two separate views required for co-training are defined using a local versus topical feature split”.  The “local” features may be considered to be lexical features, as Mihalcea Table 1 lists “local” features such as “the word itself” and “the part of speech of the word”, as “lexical” means “relating to words or vocabulary of a language”.  The “Topical features” may be considered to be semantic features, as these are described as “features extracted from a large context” and semantic refers to meaning, or context.  Mihalcea therefore discloses features selected from lexical features and semantic features, and these fall within the group consisting of lexical features, semantic features, and distribution-based features.)

It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the co-training with consensus label of Blum and Cordeiro, with the lexical and semantic feature types of Mihalcea. The modification would have been obvious because one of ordinary skill in the art would be motivated to avoid errors from ambiguous words when language processing (Mihalcea, Section 3.4: “We use Naive Bayes, since it was previously shown that in combination with the features we consider, can lead to a state-of-the-art disambiguation system”).

As per Claim 7, the combination of Blum and Cordeiro teaches the method of Claim 1.  Blum teaches the first set of features and the second set of features, wherein the first set of features are different from the second set of features (Blum, Abstract, discloses:  “In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the task of learning to classify web pages. For example, the description of a web page can be partitioned into the words occurring on that page, and the words occurring in hyperlinks that point to that page.”  Here, Blum discloses a first set of features (“words occurring on that page”) and a second set of features (“words occurring in hyperlinks that point to that page”), which comprise “two distinct views”, “distinct” implying that the second feature space is different from the first feature space.)
However, Cordeiro does not explicitly teach features selected from the group consisting of lexical features, semantic features, and distribution-based features.
Mihalcea discloses features selected from the group consisting of lexical features, semantic features, and distribution-based features (Mihalcea, Section 3.4 discloses: “Several basic word sense disambiguation classifiers can be implemented using feature combinations from Table 1, and feature vectors can be plugged into any learning algorithm. We use Naive Bayes, since it was previously shown that in combination with the features we consider, can lead to a state-of-the-art disambiguation system (Lee and Ng, 2002). Moreover, Naive Bayes is particularly suitable for co-training and self-training, since it provides confidence scores and is efficient in terms of training and testing time. The two separate views required for co-training are defined using a local versus topical feature split. For self-training, a global classifier with no feature split is defined. 
A local classifier 
A local classifier was implemented using all local features listed in Table 1.
A topical classifier
The topical classifier relies on features extracted from a large context, in particular keywords specific to each individual sense. We use the SK feature, and extract at most ten keywords for each word sense, each occurring for at least three times in the annotated corpus. 
A global classifier
Finally, the global classifier integrates all local and topical features, also in a Naive Bayes classifier. This classifier is basically a combination of the previous two local and topical classifiers.”
Here, Mihalcea discloses features selected from the group consisting of lexical features, semantic features, and distribution-based features, as Mihalcea discloses “The two separate views required for co-training are defined using a local versus topical feature split”.  The “local” features may be considered to be lexical features, as Mihalcea Table 1 lists “local” features such as “the word itself” and “the part of speech of the word”, as “lexical” means “relating to words or vocabulary of a language”.  The “Topical features” may be considered to be semantic features, as these are described as “features extracted from a large context” and semantic refers to meaning, or context.  Mihalcea therefore discloses features selected from lexical features and semantic features, and these fall within the group consisting of lexical features, semantic features, and distribution-based features.)

As per Claim 13, Claim 13 is a system claim corresponding to method Claim 6.  The difference is that it recites an interface for receiving data, a memory, a feature extraction module, and a prediction consensus generation module.  (Cordeiro, Section 4, discloses:  “This section presents the results obtained by PSO-WV and PSO-WC in two datasets from different application domains: the ACM-DL1 (Digital Library of the Association for Computing Machinery), which deals with document classification, and an Online Video Social Network database [2], where each instance represents a YouTube user.”  Here, Cordeiro discloses using data from databases, which must be done via a computer interface for receiving data, and the computer must also contain a memory.  Cordeiro, Fig 1, also discloses:

[Image: reproduction of Cordeiro, Fig. 1])
Claim 13 is rejected for the same reasons as Claim 6.

As per Claim 14, Claim 14 is a system claim corresponding to method Claim 7.   Claim 14 is rejected for the same reasons as Claim 7.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Xu et al. (“A Survey on Multi-view Learning”) presents an overview of many multi-view learning techniques, including co-training
Yu et al. (“Bayesian Co-Training”) discloses a form of co-training in which some views are weighted more if they are better at predicting output than the others
Tanha et al. (“Disagreement-Based Co-Training”) discloses a form of co-training that involves voting on labels for unlabeled data
Ng et al. (“Weakly Supervised Natural Language Learning Without Redundant Views”) discloses using majority voting to predict the label of a test instance
Amini et al. (“A Co-classification Approach to Learning from Multilingual Corpora”) discloses an algorithm called co-classification that incorporates a disagreement cost
Gibaja et al. (“An ensemble-based approach for multi-view multi-label classification”) discloses a multi-classification algorithm where views are fused at the decision level based on majority voting
Zhang et al. (US 2011/0119210 A1) discloses multiple category learning with a plurality of classifiers with a winner-take-all multiple category boosting algorithm
Munro et al. (US 2016/0162456 A1) Para [0159] discloses ensemble or voting methods, including a majority vote of the models to determine a label
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD A SIEGER whose telephone number is (571)272-9710.  The examiner can normally be reached on M-F 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571) 272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/L.A.S./Examiner, Art Unit 2126    
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126