DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This office action is in response to the application submitted on 6/19/2019.
Claims 1-20 are presented for examination.
Drawings
The drawings submitted on 6/19/2019 are acceptable for the purpose of examination.
Specification
The disclosure is objected to for the following informality:
Applicant has not submitted a signed Oath or Declaration.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-11, 13-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over You et al (Learning from Multiple Teacher Networks, herein You), Hinton et al (Distilling the Knowledge in a Neural Network, herein Hinton), and Avnimelech et al (Boosted Mixture of Experts: An Ensemble Learning Scheme, herein Avnimelech).

Regarding claim 1, 
	You teaches a computer-implemented method for performing knowledge distillation, the method comprising: (You, Figure 1, and page 1285, column 1, paragraph 1, line 7 “In this paper, we present a method to train a thin deep network by incorporating multiple teacher networks not only in output layer by averaging the softened outputs (dark knowledge) from different networks, but also in the intermediate layers by imposing a constraint about the dissimilarity among examples.”

[Image: You, Figure 1]

In other words, the method of You is a computer-implemented method, and training a thin deep network corresponds to performing knowledge distillation.)
	obtaining, by one or more computing devices, an initial training dataset that comprises a set of training examples; (You, page 1287, column 2, paragraph 4, line 5 “Formally, NS parametrized by ΘS and a training dataset [equation image omitted],…” In other words, the training dataset is the initial training dataset that comprises a set of training examples.)
	obtaining, by the one or more computing devices, a plurality of sets of outputs respectively produced for the set of training examples by a plurality of pre-trained machine-learned models, each of the plurality of pre-trained machine-learned models having been previously trained to perform a respective task based on a respective pre-trained model training dataset; (You, page 1285, column 1, paragraph 1, line 7 “In this paper, we present a method to train a thin deep network by incorporating multiple teacher networks not only in output layer by averaging the softened outputs (dark knowledge) from different networks, but also in the intermediate layers by imposing a constraint about the dissimilarity among examples.” In other words, the multiple teacher networks are the plurality of pre-trained machine-learned models, and “train a thin deep network … by incorporating multiple teacher networks” corresponds to the outputs being produced by a plurality of pre-trained machine-learned models. Examiner notes that, in order to be used for training a thin deep network, the teacher models must themselves have been trained.)
[evaluating, by the one or more computing devices, a respective performance of each pre-trained machine-learned model based at least in part on the set of outputs generated by the pre-trained machine-learned model;]
[determining, by the one or more computing devices, for the set of outputs generated by each pre-trained machine-learned model, whether to include one or more outputs of the set of outputs in a distillation training dataset based at least in part on the respective performance of such pre-trained machine-learned model; and]
training, by the one or more computing devices, a distilled machine-learned model using at least a portion of the distillation training dataset.  (You, page 1285, column 1, paragraph 1, line 7 “In this paper, we present a method to train a thin deep network by incorporating multiple teacher networks not only in output layer by averaging the softened outputs (dark knowledge) from different networks, but also in the intermediate layers by imposing a constraint about the dissimilarity among examples.” In other words, a thin deep network is distilled machine-learned model, and softened outputs is at least a portion of the distillation dataset.)
	Thus far, You does not explicitly teach evaluating, by the one or more computing devices, a respective performance of each pre-trained machine-learned model based at least in part on the set of outputs generated by the pre-trained machine-learned model;
Hinton teaches evaluating, by the one or more computing devices, a respective performance of each pre-trained machine-learned model based at least in part on the set of outputs generated by the pre-trained machine-learned model; (Hinton, page 8, paragraph 4, line 1 “We train a generalist model and then use the confusion matrix to define the subsets that the specialists are trained on.  Once these subsets have been defined the specialists can be trained entirely independently.  At test time we can use the predictions from the generalist model to decide which specialists are relevant and only these specialists need to be run.” And, page 8, paragraph 3, line 3 “At the same time as the experts are learning to deal with the examples assigned to them, the gating network is learning to choose which experts to assign each example to based on the relative discriminative performance of the experts for that example.”   In other words, choose which experts to assign each example to based on the relative discriminative performance is evaluating… a respective performance of each pre-trained machine-learned model based at least in part on the set of outputs generated by the pre-trained machine-learned model.)
	Both You and Hinton are directed to transferring knowledge from teacher networks to a student network, also known as knowledge distillation. In view of the teaching of You, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Hinton into You.  This would result in being able to evaluate pre-trained machine-learned models (also known as specialists) based on their performance for the purpose of inclusion in the distillation training set.
	One of ordinary skill in the art would be motivated to do this because it would improve the accuracy of the model as well as the speed of training. (Hinton, page 1, paragraph 1, line 11 “We also introduce a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.  Unlike a mixture of experts, these specialist models can be trained rapidly and in parallel.”)
	Thus far, the combination of You and Hinton does not explicitly teach determining, by the one or more computing devices, for the set of outputs generated by each pre-trained machine-learned model, whether to include one or more outputs of the set of outputs in a distillation training dataset based at least in part on the respective performance of such pre-trained machine-learned model; and
	Avnimelech teaches determining, by the one or more computing devices, for the set of outputs generated by each pre-trained machine-learned model, whether to include one or more outputs of the set of outputs in a distillation training dataset based at least in part on the respective performance of such pre-trained machine-learned model; and (Avnimelech, page 484, paragraph 2, line 1 “Other methods use dynamic linear combination models, using a confidence measure of the ensemble members regarding each pattern.  Different measures of the confidence of each predictor can be used for determining the relative contribution of each expert (Tresp & Taniguchi, 1995; Shimshoni & Intrator, 1996).” In other words, determining the relative contribution of each expert is determining…whether to include one or more outputs of the set of outputs, and using a confidence measure is based, at least in part, on the respective performance of the pre-trained machine-learned model.)
	Both Avnimelech and the combination of Hinton and You are directed to ensemble learning schemes.  In view of the teaching of Hinton and You, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Avnimelech into the combination of Hinton and You.  This would result in being able to evaluate outputs of pre-trained machine-learned models (also known as predictors, or teachers) for the purpose of determining their relative contribution to the distillation training set.
	One of ordinary skill in the art would be motivated to do this in order to improve the performance of the ensemble trained model by partitioning parts of the training set to different classifiers.  (Avnimelech, paragraph 4, line 6 “A different approach is training the classifiers on different parts of the training set, partitioned in a manner such that their distributions differ. Such an approach, which is presented here, combines two algorithms: boosting and mixture of experts (also known as BME).” And, page 494, paragraph 6, line 1 “The results indicate that the performance of an ensemble machine trained with the BME algorithm (and combined appropriately) is significantly better than a standard ensemble (parallel machine).”)
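For illustration only, the four limitations of claim 1 as mapped above (obtaining teacher outputs, evaluating each teacher, determining inclusion, and assembling the distillation training dataset) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the application or of any cited reference: the function names, the accuracy metric, the availability of reference labels, and the `min_accuracy` threshold are all hypothetical, and the final training of the distilled model is left out.

```python
from typing import Dict, List, Tuple


def accuracy(outputs: List[int], labels: List[int]) -> float:
    """Fraction of a teacher's outputs that match the reference labels."""
    return sum(o == y for o, y in zip(outputs, labels)) / len(labels)


def build_distillation_dataset(
    examples: List[str],
    teacher_outputs: Dict[str, List[int]],
    labels: List[int],
    min_accuracy: float = 0.7,  # hypothetical threshold, not taken from the claims
) -> List[Tuple[str, int]]:
    """Keep (example, output) pairs only from teachers that score well enough."""
    dataset: List[Tuple[str, int]] = []
    for name, outputs in teacher_outputs.items():
        # Evaluating step: score each teacher on the outputs it produced.
        if accuracy(outputs, labels) >= min_accuracy:
            # Determining step: this teacher's outputs are included.
            dataset.extend(zip(examples, outputs))
    return dataset


examples = ["x0", "x1", "x2", "x3"]
labels = [0, 1, 1, 0]
teacher_outputs = {
    "teacher_a": [0, 1, 1, 0],  # 4 of 4 correct, so it is included
    "teacher_b": [1, 0, 1, 0],  # 2 of 4 correct, so it is excluded
}
distillation_set = build_distillation_dataset(examples, teacher_outputs, labels)
```

In this sketch a distilled model would then be trained on `distillation_set`; any other performance measure could be substituted for accuracy without changing the structure.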
Regarding claim 2,
	The combination of Avnimelech, Hinton and You teaches the computer-implemented method of claim 1,
	wherein the plurality of pre-trained machine-learned models comprise pre-trained classifier models configured to infer one or more classification labels for each training example as an output.  (You, Figure 1, and page 1285, column 2, paragraph 2, line 2 “The well-trained wide deep networks are naturally regarded as teachers, which have the capability of guiding the training of a new student network of the smaller size.” In other words, teachers are pre-trained machine-learned models that are pre-trained classifier models, configured to infer one or more classification labels for each training example as an output.)
Regarding claim 3,
	The combination of Avnimelech, Hinton and You teaches the computer-implemented method of claim 2, 
	wherein evaluating, by the one or more computing devices, the respective performance of each pre-trained machine-learned model based at least in part on the set of outputs generated by the pre-trained machine-learned model comprises evaluating, by the one or more computing devices, the respective performance of each pre-trained machine-learned model on a per-classification label basis.  (Avnimelech, page 486, paragraph 1, line 2 “The gating function assigns probability to each of the experts based on the current input.  In the training stage, this value states the probability of a pattern’s appearing in an expert’s training set.  In the test stage, it defines the relative contribution of each expert to the ensemble.  The training attempts to achieve two goals: (1) for a given expert, find the optimal gating function, and (2) for a given gating function, train each expert to achieve maximal performance on the distribution assigned to it by the gating function.” and, page 487, paragraph 2, line 3 “In boosting, the first classifier is trained on all patterns, and the localization criterion for the distributions presented to the two other classifiers is the level of difficulty of the patterns as measured by classification performance.” In other words, gating function is evaluating, classification performance is respective performance where the respective performance is on a per-classification label basis.)
Regarding claim 4,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 3,
	wherein determining, by the one or more computing devices for the set of outputs generated by each pre-trained machine-learned model, whether to include one or more outputs of the plurality set of outputs in the distillation training dataset comprises determining, by the one or more computing devices for each pre-trained machine-learned model and for each classification label, whether to include in the distillation training dataset all outputs of the set of outputs with the classification label.  (Avnimelech, page 484, paragraph 2, line 1 “Other methods use dynamic linear combination models, using a confidence measure of the ensemble members regarding each pattern.  Different measures of the confidence of each predictor can be used for determining the relative contribution of each expert (Tresp & Taniguchi, 1995; Shimshoni & Intrator, 1996).” And, page 484, paragraph 3, line 6 “A different approach is training the classifiers on different parts of the training set, partitioned in a manner such that their distributions differ.  Such an approach, which is presented here, combines two algorithms: boosting and mixture of experts.” In other words, using a confidence measure for each predictor to determine relative contribution is determining… whether to include one or more outputs in a distillation dataset based, at least in part, on the respective performance of the pre-trained machine-learned model, and training the classifiers on different parts of the training set is for each classification label.)
Regarding claim 5,
	The combination of Avnimelech, Hinton and You teaches the computer-implemented method of claim 3,
	wherein determining, by the one or more computing devices for the set of outputs generated by each pre-trained machine-learned model, whether to include one or more outputs of the plurality set of outputs in the distillation training dataset comprises, for each classification label: (Avnimelech, page 484, paragraph 2, line 1 “Other methods use dynamic linear combination models, using a confidence measure of the ensemble members regarding each pattern.  Different measures of the confidence of each predictor can be used for determining the relative contribution of each expert (Tresp & Taniguchi, 1995; Shimshoni & Intrator, 1996).” And, page 484, paragraph 3, line 6 “A different approach is training the classifiers on different parts of the training set, partitioned in a manner such that their distributions differ.  Such an approach, which is presented here, combines two algorithms: boosting and mixture of experts.” In other words, determining relative contribution is determining… whether to include… one or more outputs of the plurality set of outputs, and training the classifiers on different parts is for each classification label.)
	selecting, by the one or more computing devices, a highest-performing pre-trained machine-learned model for such classification label; and (Hinton, page 8, paragraph 4, line 3 “At test time we can use the predictions from the generalist model to decide which specialists are relevant and only these specialists need to be run.” In other words, decide which specialists are relevant is selecting…a highest-performing pre-trained machine-learned model for such classification label.)
	including, by the one or more computing devices, in the distillation training dataset all outputs of the set of outputs generated by the highest-performing pre-trained machine-learned model that have such classification label.  (Hinton, page 6, paragraph 2, line 1 “When the number of classes is very large, it makes sense for the cumbersome model to be an ensemble that contains one generalist model trained on all the data and many “specialist” models, each of which is trained on data that is highly enriched in examples from a very confusable subset of the classes (like different types of mushroom).” And, page 6, paragraph 6, line 12 “The distribution pm is a distribution over all the specialist classes of m plus a single dustbin class, so when computing its KL divergence from the full q distribution we sum all of the probabilities that the full q distribution assigns to all the classes in m’s dustbin.” In other words, specialist models are highest performing pre-trained machine-learned models, and sum all of the probabilities that the full q distribution assigns is including all of the specialist outputs for their specifically assigned classification labels.)
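For illustration only, the per-label selection recited in claim 5 can be sketched as follows. This is a sketch under stated assumptions rather than the applicant's or Hinton's implementation: the helper names are hypothetical, reference labels are assumed to be available, and per-label performance is assumed to be measured as the fraction of examples with a given reference label that a model labels correctly.

```python
from typing import Dict, List, Tuple


def per_label_accuracy(outputs: List[int], labels: List[int], label: int) -> float:
    """Accuracy restricted to examples whose reference label is `label`."""
    idx = [i for i, y in enumerate(labels) if y == label]
    if not idx:
        return 0.0
    return sum(outputs[i] == label for i in idx) / len(idx)


def select_outputs_per_label(
    examples: List[str],
    teacher_outputs: Dict[str, List[int]],
    labels: List[int],
) -> Tuple[Dict[int, str], List[Tuple[str, int]]]:
    """For each label, pick the best teacher and keep its outputs with that label."""
    best_teacher: Dict[int, str] = {}
    dataset: List[Tuple[str, int]] = []
    for label in sorted(set(labels)):
        # Selecting step: highest-performing teacher for this label.
        best = max(
            teacher_outputs,
            key=lambda name: per_label_accuracy(teacher_outputs[name], labels, label),
        )
        best_teacher[label] = best
        # Including step: all of that teacher's outputs that carry this label.
        dataset.extend(
            (x, o) for x, o in zip(examples, teacher_outputs[best]) if o == label
        )
    return best_teacher, dataset


examples = ["x0", "x1", "x2", "x3"]
labels = [0, 0, 1, 1]
teacher_outputs = {
    "teacher_a": [0, 0, 1, 2],  # perfect on label 0, half right on label 1
    "teacher_b": [0, 2, 1, 1],  # half right on label 0, perfect on label 1
}
best_teacher, distillation_set = select_outputs_per_label(
    examples, teacher_outputs, labels
)
```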

Regarding claim 6, 
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 1,
	wherein evaluating, by the one or more computing devices, the respective performance of each pre-trained machine-learned model based at least in part on the set of outputs generated by the pre-trained machine-learned model comprises training, by the one or more computing devices using a validation dataset, one or more machine-learned trust models to evaluate the respective performance of each pre-trained machine-learned model. (Hinton, page 8, paragraph 4, line 3 “At test time we can use the predictions from the generalist model to decide which specialists are relevant and only these specialists need to be run.” In other words, decide which specialists are relevant is determining…whether to include one or more outputs of the plurality set of outputs in the distillation training dataset, and generalist model is trust model.)
Regarding claim 7,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 6,
	wherein determining, by the one or more computing devices for the set of outputs generated by each pre-trained machine-learned model, whether to include one or more outputs of the set of outputs in the distillation training dataset based at least in part on the respective performance of such pre-trained machine-learned model comprises, for each pre-trained machine-learned model: providing, by the one or more computing devices, each respective training example as an input into at least one of the one or more trust models; and (Hinton, page 6, paragraph 2, line 1, “When the number of classes is very large, it makes sense for the cumbersome model to be an ensemble that contains one generalist model trained on all the data and many “specialist” models, each of which is trained on data that is highly enriched in examples from a very confusable subset of the classes (like different types of mushroom).” In other words, generalist model trained on all the data is providing… each respective training example as an input into at least one of the one or more trust models.)
	receiving, by the one or more computing devices, an output from the at least one of the one or more trust models that indicates whether the corresponding output generated by the pre-trained machine-learned model for the respective training example should be included in the distillation training dataset.  (Hinton, page 6, paragraph 6, line 2 “In addition to the specialist models, we always have a generalist model so that we can deal with classes for which we have no specialists and so that we can decide which specialists to use.” In other words, deciding which specialists to use is indicates whether the corresponding output…should be included in the distillation training dataset.)
Regarding claim 8,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 6,
	wherein the initial training dataset comprises a first portion that is labeled and a second portion that is not labeled, and (You, page 1290, column 2, paragraph 4, line 1 “Remark 2. Like the generalized distillation [23], our method also can be extended to semi-supervised cases.  Since the losses related with teacher networks in objective Eq. (7) are both label-free, the numerous unlabeled examples can still be involved in the training.”  In other words, semi-supervised cases is a training dataset where a first portion is labeled and a second portion is not labeled.)
wherein the first portion of the initial training dataset is used as the validation dataset.  (You, page 1290, column 2, paragraph 4, line 7 “Then the labeled examples are capable of fine tuning the student network to further improve the performance.”  In other words, the labeled examples is the first portion, and fine tuning the student network is used as the validation set.)
Regarding claim 9,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 6,
	wherein the one or more machine-learned trust models each comprise a neural network.  (Hinton, page 8, paragraph 4, line 3 “At test time we can use the predictions from the generalist model to decide which specialists are relevant and only these specialists need to be run.” And, page 1, paragraph 1, line 3 “Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets.” In other words, the generalist model is trust model, and the generalist model is a neural network.)
Regarding claim 10,
	The combination of Avnimelech, Hinton and You teaches the computer-implemented method of claim 1,
	wherein evaluating, by the one or more computing devices, a respective performance of each pre-trained machine-learned model comprises: selecting, by the one or more computing devices, one or more expert models from the plurality of pre-trained machine-learned models by comparing a population statistic determined in part from the plurality of sets of outputs to the set of outputs determined by at least one of the pre-trained machine-learned models.  (Hinton, page 8, paragraph 4, line 3 “At test time we can use the predictions from the generalist model to decide which specialists are relevant and only these specialists need to be run.” And, page 2, paragraph 6, line 1, “Neural networks typically produce class probabilities by using a “softmax” output layer that converts the logit, zi, computed for each class into a probability, qi, by comparing zi with the other logits.

[Equation image omitted: Hinton's softmax expression converting the logits zi into probabilities qi]”

And, page 2, paragraph 3, line 1 “An obvious way to transfer the generalization ability of the cumbersome model to a small model is to use the class probabilities…” In other words, decide which specialists are relevant is selecting…by comparing, class probabilities is population statistic, and comparing zi with the other logits is determined in part… from the plurality of sets of outputs.)
Regarding claim 11,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 10,
	wherein selecting, by the one or more computing devices, one or more expert models comprises comparing the population statistic to each set of outputs determined respectively by each pre-trained machine-learned model.  (Hinton, page 8, paragraph 4, line 3 “At test time we can use the predictions from the generalist model to decide which specialists are relevant and only these specialists need to be run.” And, page 2, paragraph 6, line 1, “Neural networks typically produce class probabilities by using a “softmax” output layer that converts the logit, zi, computed for each class into a probability, qi, by comparing zi with the other logits.

[Equation image omitted: Hinton's softmax expression converting the logits zi into probabilities qi]”

And, page 2, paragraph 3, line 1 “An obvious way to transfer the generalization ability of the cumbersome model to a small model is to use the class probabilities…” In other words, decide which specialists are relevant is selecting… one or more expert models, and use class probabilities is comparing the population statistic to each set of outputs.)
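For illustration only, the softmax conversion quoted from Hinton and the averaging of softened teacher outputs quoted from You can be sketched as follows. The temperature form q_i = exp(z_i/T) / Σ_j exp(z_j/T) is the one given in Hinton; the function names, the example logits, and the choice of T are assumptions made for this sketch and do not come from the record.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits z_i into probabilities q_i, softened by a temperature T."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def average_soft_targets(
    teacher_logits: List[List[float]], temperature: float = 2.0
) -> List[float]:
    """Average the softened class probabilities produced by several teachers."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_teachers = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_teachers for c in range(n_classes)]


# Hypothetical logits from two teachers for one training example.
teacher_logits = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3]]
soft_targets = average_soft_targets(teacher_logits, temperature=2.0)
```

A higher temperature flattens the distribution, which is why the softened outputs carry more information about the relative similarity of classes than hard labels do.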
Regarding claim 13,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 1,
	wherein at least 50% of the initial training dataset comprises unlabeled or weakly labeled training examples.  (Hinton, page 2, paragraph 5, line 1 “The transfer set that is used to train the small model could consist entirely of unlabeled data [1] or we could use the original training set.”  In other words, a transfer set that consists entirely of unlabeled data is one in which at least 50% of the initial training dataset comprises unlabeled or weakly labeled training examples.)
Regarding claim 14,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 1,
	wherein obtaining the plurality of sets of outputs comprises respectively performing inference on the set of training examples with the plurality of pre-trained machine-learned models.  (You, page 1285, column 1, paragraph 1, line 7 “In this paper, we present a method to train a thin deep network by incorporating multiple teacher networks not only in output layer by averaging the softened outputs (dark knowledge) from different networks, but also in the intermediate layers by imposing a constraint about the dissimilarity among examples.” In other words, softened outputs is sets of outputs, multiple teacher networks is plurality of pre-trained machine-learned models, and incorporating…output layer by averaging the softened outputs is generated by performing inference on the set of training examples.)
Claim 15 is a computing system claim corresponding to the computer-implemented method of claim 1. Otherwise, they are the same.  It is implicit that a computer-implemented method requires a computing system comprising one or more processors and one or more non-transitory computer-readable media that collectively store instructions that the one or more processors execute to perform the method.  Therefore, claim 15 is rejected for the same reasons as claim 1.
Claims 16, 17, and 19 are computing system claims corresponding to the computer-implemented method of claims 6, 14, and 10 respectively.  Otherwise, they are the same. Therefore, claims 16, 17, and 19 are rejected for the same reasons as claims 6, 14, and 10 respectively.
Claim 20 is a non-transitory computer-readable medium claim that corresponds to the computer-implemented method of claim 1.  Otherwise, they are the same.  It is implicit that a computer-implemented method requires one or more non-transitory computer-readable media that store the instructions to be executed.  Therefore, claim 20 is rejected for the same reasons as claim 1.
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over You, Hinton, and Avnimelech, in view of Guyon et al (Feature Extraction, Foundations and Applications, herein Guyon).
Regarding claim 12,
	The combination of Avnimelech, Hinton, and You teaches the computer-implemented method of claim 10,
	Thus far the combination of Avnimelech, Hinton and You does not explicitly teach wherein determining, by the one or more computing devices, whether to include one or more outputs of the plurality of sets of outputs in the distillation training dataset is further based at least in part on a weighting applied to each of the sets of outputs generated by the one or more expert models.  
	Guyon teaches wherein determining, by the one or more computing devices, whether to include one or more outputs of the plurality of sets of outputs in the distillation training dataset is further based at least in part on a weighting applied to each of the sets of outputs generated by the one or more expert models.  (Guyon, Chapter 11, Ensembles of Regularized Least Squares Classifiers for High-Dimensional Problems, page 311, paragraph 6, Section 11.6.3 Ensemble Postprocessing, line 1 “A well-known avenue to improve the accuracy of an ensemble is to replace the simple averaging of individual experts by a weighting scheme.  Instead of giving equal weight to each expert, the outputs of more reliable experts are weighted up.  Linear regression can be applied to learn these weights.” In other words, using a weighting scheme is based at least in part on a weighting applied to each of the sets of outputs generated by the one or more expert models.)
	Both Guyon and the combination of Avnimelech, Hinton, and You are directed to using variations of ensemble methods as classifiers. In view of the teaching of the combination of Avnimelech, Hinton, and You, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Guyon into the combination of Avnimelech, Hinton, and You.  This would result in being able to improve the accuracy of an ensemble by applying a weighting to the outputs.
	One of ordinary skill in the art would be motivated to do this to improve the accuracy of the ensemble (also known as pre-trained machine-learned models). (Guyon, page 311, paragraph 6, line 1 “A well-known avenue to improve the accuracy of an ensemble is to replace the simple averaging of individual experts by a weighting scheme.”)
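For illustration only, the weighting scheme quoted from Guyon, in which the outputs of more reliable experts are weighted up instead of being simply averaged, can be sketched as follows. The weights here are supplied by hand as a hypothetical example; Guyon notes that linear regression can be used to learn them, which this sketch does not attempt. The function name and the example values are assumptions.

```python
from typing import List


def weighted_ensemble(
    expert_probs: List[List[float]], weights: List[float]
) -> List[float]:
    """Combine per-class probabilities from several experts using given weights."""
    total = sum(weights)
    normalized = [w / total for w in weights]  # weights are rescaled to sum to 1
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(normalized, expert_probs))
        for c in range(n_classes)
    ]


# Hypothetical outputs of two experts for one example (two classes).
expert_probs = [[0.9, 0.1], [0.4, 0.6]]
simple_average = weighted_ensemble(expert_probs, [1.0, 1.0])
weighted = weighted_ensemble(expert_probs, [3.0, 1.0])  # first expert trusted more
```

With equal weights the sketch reduces to simple averaging; with the first expert weighted three times as heavily, the combined output moves toward that expert's prediction.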
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over You, Hinton, and Avnimelech, in view of Caruana et al (Ensemble Selection from Libraries of Models, herein Caruana).
Regarding claim 18,
	The combination of Avnimelech, Hinton, and You teaches the computing system of claim 15,
	Thus far, the combination of Avnimelech, Hinton and You does not explicitly teach wherein obtaining the plurality of sets of outputs comprises accessing the plurality of sets of outputs from a database that stores previously generated inferences for the plurality of pre-trained machine-learned models.  
	Caruana teaches wherein obtaining the plurality of sets of outputs comprises accessing the plurality of sets of outputs from a database that stores previously generated inferences for the plurality of pre-trained machine-learned models.  (Caruana, page 1, column 1, line 1 “We present a method for constructing ensembles from libraries of thousands of models. Model libraries are generated using different learning algorithms and parameter settings.  Forward stepwise selection is used to add to the ensemble the models that maximize its performance.” In other words, libraries is database that stores previously generated inferences for the plurality of pre-trained machine-learned models.)
	Both Caruana and the combination of Avnimelech, Hinton, and You are directed to using some variation of ensemble to speed up inference and improve the accuracy of classifiers.  In view of the teaching of the combination of Avnimelech, Hinton, and You, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Caruana into the combination of Avnimelech, Hinton, and You.  This would result in being able to select from libraries of machine learning models in order to build an accurate ensemble.
	One of ordinary skill in the art would be motivated to do so in order to improve the accuracy of the method by increasing the diversity of the ensemble. (Caruana, page 1, column 1, paragraph 2, line 3 “Dietterich (2000) states that “A necessary and sufficient condition for an ensemble of classifiers to be more accurate than any of its individual members is if the classifiers are accurate and diverse.””)
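For illustration only, forward stepwise selection over a library of stored predictions, as described in the passage quoted from Caruana, can be sketched as follows. This is a simplified sketch under stated assumptions: the library is a plain dictionary of previously generated class-1 probabilities, each model is added at most once, the ensemble averages its members' probabilities, and selection stops when accuracy no longer improves. The names and data are hypothetical.

```python
from typing import Dict, List, Tuple


def ensemble_accuracy(
    members: List[str], library: Dict[str, List[float]], labels: List[int]
) -> float:
    """Accuracy of the averaged stored predictions of the selected members."""
    correct = 0
    for i, y in enumerate(labels):
        mean_prob = sum(library[m][i] for m in members) / len(members)
        correct += int(mean_prob >= 0.5) == y
    return correct / len(labels)


def forward_stepwise_selection(
    library: Dict[str, List[float]], labels: List[int]
) -> Tuple[List[str], float]:
    """Greedily add the model that most improves ensemble accuracy."""
    selected: List[str] = []
    best_score = 0.0
    candidates = set(library)
    while candidates:
        scored = [
            (ensemble_accuracy(selected + [m], library, labels), m)
            for m in sorted(candidates)
        ]
        score, model = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no candidate improves the ensemble any further
        selected.append(model)
        candidates.remove(model)
        best_score = score
    return selected, best_score


# Hypothetical library of stored class-1 probabilities for four examples.
labels = [1, 0, 1, 0]
library = {
    "model_a": [0.9, 0.6, 0.4, 0.1],
    "model_b": [0.6, 0.2, 0.7, 0.6],
    "model_c": [0.3, 0.3, 0.8, 0.2],
}
selected, score = forward_stepwise_selection(library, labels)
```

Because the library holds stored predictions rather than the models themselves, no inference is run during selection.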
Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359.  The examiner can normally be reached on Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on 571-270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/B.I.R./Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124