DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-20 are pending.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim(s) 15 is/are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
Claim(s) 15 is/are directed to "A non-volatile computer-readable storage medium". Applicant describes a computer readable medium by giving an open-ended description in the specification: e.g., [0086]. “A non-volatile computer-readable storage medium” is not explicitly or deliberately defined to include only the non-transitory embodiments in the specification. Moreover, the term "A non-volatile computer-readable storage medium" can encompass carrier waves. See, e.g., U.S. Patent No. 7,139,977 col. 3, ll. 20-24 ("In particular, the methods described herein may be implemented by a series of computer-executable instructions residing on a storage medium such as a carrier wave, disk drive, or computer-readable medium."), U.S. Pat. App. Pub. No. 2008/0165127 at paragraph [0071] ("Examples of the portable device or computer readable recording medium include ... storage media such as carrier waves (e.g., transmission through the Internet)."). The broadest reasonable interpretation of a claim drawn to a computer readable medium typically covers forms of non-transitory tangible media and transitory propagating signals per se in view of the ordinary and customary meaning of computer readable media. See Subject Matter Eligibility of Computer Readable Media, 1351 OG 212 (26 Jan 2010). See MPEP 2111.01. Signals are nothing but the physical characteristics of a form of energy, and as such are nonstatutory natural phenomena. See, e.g., In re Nuijten, Docket no. 2006-1371 (Fed. Cir. Sept. 20, 2007) (slip op. at 18) ("A transitory, propagating signal like Nuijten's is not a 'process, machine, manufacture, or composition of matter.' ... Thus, such a signal cannot be patentable subject matter.").
Thus, claim(s) 15 is/are rejected under 35 U.S.C. 101 because, giving the claim(s) their broadest reasonable interpretation, the claimed "A non-volatile computer-readable storage medium" encompasses non-statutory subject matter. It is suggested to amend the recited “A non-volatile computer-readable storage medium” to be "A non-transitory, non-volatile computer-readable storage medium" for overcoming the rejection.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim(s) 1-20 is/are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claim(s) 1, 8 and 15 recite the limitations “extracting, by a computing device, a plurality of data sets from unlabeled data, each data set including a first preset number of data samples; for each data set, creating a plurality of sample sets by assigning labels to data samples in the data set, each sample set including the first preset number of data samples with respective labels,…” as recited in claim 1. Claims 8 and 15 recite similar limitations. The limitations are logically conflicting. According to the limitations, each data set in the unlabeled data includes a first preset number of data samples. Each data set further includes a plurality of sample sets. Yet, each sample set also includes the first preset number of data samples. When the data set contains two or more sample sets, the number of data samples in the data set would differ from the number of data samples in a sample set. The recited limitations hold only when the number of sample sets in a data set equals one, rather than more than one (a plurality). The ambiguity introduced by the above limitations renders the claims indefinite.

Claim(s) 2-7, 9-14 and 16-20 is/are rejected under 112(b) for the same reason as given in their respective base claim(s).


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), fourth paragraph:
Subject to the [fifth paragraph of 35 U.S.C. 112 (pre-AIA)], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claim(s) 7 is/are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends.

Claim(s) 7 is/are directed to “A classifier training method” and depend(s) on its/their respective base claim(s) 1. The base claim(s) is/are directed to “A data processing method”. The dependent claim(s) and the base claim(s) do not claim the same subject matter. The dependent claim(s) 7 fail(s) to further limit the subject matter of the respective base claim(s) upon which it/they depend(s).
Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
	In the case of placing claim 7 in proper dependent form, an example for overcoming the present 112(d) rejection is given below:

“7. The data processing method according to claim 1, further comprising:
repeatedly obtaining a batch of data samples from the unlabeled data and adding the sample sets in the candidate training set corresponding to the current batch of data samples to the labeled data; and
training a current classifier by using the labeled data added with the sample sets in the candidate training set corresponding to the current batch of data samples each time.”


Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

Claim(s) 1, 5-8, 12-15 and 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Homma et al (US2011/0295778).

Regarding claims 1, 8 and 15, Homma teaches a data processing method, including:
extracting, by a computing device, a plurality of data sets from unlabeled data, each data set including a first preset number of data samples;
for each data set, creating a plurality of sample sets by assigning labels to data samples in the data set,
	each sample set including the first preset number of data samples with respective labels,
(Examiner’s note: see the 112(b) rejection on claim 1. In this office action, the recited limitations “each data set including a first preset number of data samples; for each data set, creating a plurality of sample sets by assigning labels to data samples in the data set” are interpreted as “each data set including at least a first preset number of data samples; for each data set, creating one or more sample sets by assigning labels to data samples in the data set”.)
	the labels of the data samples in each sample set constituting a label combination, and
	label combinations corresponding to different sample sets being different from each other;
(Homma, Fig. 7; the sum of P_k1, P_k2 and P_k3 can be unlabeled data (known data which can be further labeled); scenario 1: each of P_k1, P_k2 and P_k3 can be considered as a data set (DS_k1, DS_k2 and DS_k3) and a sample set (SS_k1, SS_k2, and SS_k3); scenario 2: (P_k1 + P_k2) can also be considered as a single data set (DS_k1_k2) which contains two sample sets (SS_k1 and SS_k2); learning samples L1, L2 and L3 extracted from respective sample set SS_k1, SS_k2 or SS_k3 can be a same preset number Lp; each data set contains at least Lp samples; “The data pool generation section may further generate a known data pool which contains, among the data included in the data group, known data in which the class to be classified into is known and has a label of the class into which the known data is classified. The learning sample collection section may further randomly extract a predetermined number of pieces of the data from the known data pool having the label and may collect a learning sample containing the extracted data”, [0012]; “pieces of known data are classified into classes having labels of “camera”, “leopard”, and “watch”, respectively”, [0055]; i.e., in scenario 2, SS_k1 and SS_k2 from DS_k1_k2 can be labeled as “camera” and “leopard”, respectively; similarly, SS_k3 from DS_k3 can be labeled as “watch”; these labels can form various label combinations, e.g., (camera, leopard), (camera, watch), etc.)
respectively training, for each sample set created from the data set, a classifier by using the sample set and labeled data;
(Homma, Fig. 5, “the classifier generation section 130 generates multiple classifiers from the multiple learning samples L which have been collected (Step S105)”, [0072]; training a classifier is part of creating the classifier)
(While Homma does not expressly disclose how P_k1, P_k2, P_k3, etc., are labeled, Homma does indicate that any unknown data (unlabeled data) may be labeled using unsupervised cluster analysis, “The classification section 160 classifies unknown data included in the data group into any one of a predetermined number of classes based on the output feature quantity. Here, for the classification of the unknown data, there may be used a technique of unsupervised classification such as a cluster analysis”, [0048]; it would be obvious that the data sets P_k1, P_k2, P_k3 may be labeled using unsupervised cluster analysis to become known data sets or labeled data sets for initial data labeling)
obtaining a sample set that corresponds to a trained classifier with the highest performance among the plurality of sample sets created from the data set; and
(Homma, Fig. 3 “illustrating a feature quantity of unknown data in a feature quantity space S1”, [0058]; feature quantity space S1 may contain multiple features; Fig. 4, the classification or learning process classifies different features from feature quantity space S1 into individual feature quantity spaces S1a, S1b,… S1f, [0058-0059]; for classifying feature “bonsai”, the classifier in feature quantity space S1a has the best performance compared to other classifiers; so this classifier is chosen for classifying feature “bonsai” and all other classifiers (S1b-S1f) will not be proper)
adding the obtained sample set to a candidate training set; and
adding, by the computing device, a second preset number of sample sets in the candidate training set to the labeled data.
(Homma, generating/training a mix classifier using both known data set (labeled data set) and unknown data set for classifier learning accuracy improvement, [0085]; which is a kind of semi-supervised learning process; Fig. 7, after the mix classification, sample set Ln from the unknown data set P_u becomes a new known sample set and obviously the new known sample set, i.e., the candidate data set, may all be included in the known data set for classifying other unknown data)

Regarding claims 5 and 19, Homma teaches its/their respective base claim(s).
Homma further teaches the data processing method according to claim 1, wherein the second preset number of sample sets are a total number of sample sets in the candidate training set.
(Homma, generating/training a mix classifier using both known data set (labeled data set) and unknown data set for classifier learning accuracy improvement, [0085]; which is a kind of semi-supervised learning process; Fig. 7, after the mix classification, sample set Ln from the unknown data set P_u becomes a new known sample set and obviously the new known sample set, i.e., the candidate data set, may all be included in the known data set for classifying other unknown data)

Regarding claims 6, 13 and 20, Homma teaches its/their respective base claim(s).
Homma further teaches the data processing method according to claim 1, wherein extracting the plurality of data sets from unlabeled data comprises:
clustering the unlabeled data to obtain a plurality of clusters of the unlabeled data; and
forming a data set by extracting one or more data samples from each cluster of the unlabeled data and forming the data set having the first preset number of data samples using the extracted one or more data samples.
(Homma, “The classification section 160 classifies unknown data included in the data group into any one of a predetermined number of classes based on the output feature quantity. Here, for the classification of the unknown data, there may be used a technique of unsupervised classification such as a cluster analysis”, [0048])
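For illustration only, the clustering-based extraction recited in claims 6, 13 and 20 can be pictured with the minimal sketch below. It rests on assumptions that appear in neither the application nor Homma: one-dimensional samples, a plain k-means loop as the clustering technique, and a round-robin draw across clusters. The names kmeans_1d and form_data_set are hypothetical.

```python
from typing import List


def kmeans_1d(points: List[float], k: int, iters: int = 20) -> List[List[float]]:
    """Cluster 1-D points with a plain k-means loop (assumed technique)."""
    ordered = sorted(points)
    # Deterministic initialisation: spread the initial centers over the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    clusters: List[List[float]] = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in ordered:
            nearest = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as its cluster mean; keep the old center if empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


def form_data_set(clusters: List[List[float]], n: int) -> List[float]:
    """Draw samples round-robin across clusters until n samples are collected."""
    data_set: List[float] = []
    depth = 0
    while len(data_set) < n and any(depth < len(c) for c in clusters):
        for c in clusters:
            if depth < len(c) and len(data_set) < n:
                data_set.append(c[depth])
        depth += 1
    return data_set


if __name__ == "__main__":
    unlabeled = [0.1, 0.2, 0.3, 9.8, 9.9, 10.0]
    clusters = kmeans_1d(unlabeled, k=2)
    # Here the "first preset number" is assumed to be 4.
    print(form_data_set(clusters, n=4))
```

In this sketch each cluster contributes one or more samples, and the draw stops once the assumed preset number of samples has been collected.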

Regarding claims 7 and 14, Homma teaches its/their respective base claim(s).
Homma further teaches the classifier training method, comprising:
repeatedly obtaining a batch of data samples from the unlabeled data and adding the sample sets in the candidate training set corresponding to the current batch of data samples to the labeled data by using the data processing method according to claim 1; and
training a current classifier by using the labeled data added with the sample sets the candidate training set corresponding to the current batch of data samples each time.
(Homma, generating/training a mix classifier using both known data set (labeled data set) and unknown data set for classifier learning accuracy improvement, [0085]; which is a kind of semi-supervised learning process; Fig. 7, after the mix classification, sample set Ln from the unknown data set P_u becomes a new known sample set and obviously the new known sample set, i.e., the candidate data set, may all be included in the known data set for classifying other unknown data; “The learning sample collection may be performed by repeating the following processing: extracting a predetermined number of pieces of data contained in any one of the data pools, the predetermined number being sufficient for generating a classifier in the succeeding processing; and setting the extracted predetermined number of pieces of data as one learning sample L”, [0067])

Regarding claim 12, Homma teaches its/their respective base claim(s).
Homma further teaches the data processing device according to claim 8, wherein the one or more processors are further configured to add all the sample sets in the candidate training set to the labeled data.
(Homma, generating/training a mix classifier using both known data set (labeled data set) and unknown data set for classifier learning accuracy improvement, [0085]; which is a kind of semi-supervised learning process; Fig. 7, after the mix classification, sample set Ln from the unknown data set P_u becomes a new known sample set and obviously the new known sample set, i.e., the candidate data set, may all be included in the known data set for classifying other unknown data)

Claim(s) 2-3, 9-10 and 16-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Homma et al (US2011/0295778) in view of Song et al (US9275345).

Regarding claims 2, 9 and 16, Homma teaches its/their respective base claim(s).
Homma does not expressly disclose but Song teaches the data processing method according to claim 1, wherein before respectively training, for each sample set created from the data set, the classifier by using the sample set and the labeled data, the method further comprises:
dividing the labeled data into a training set for classifier training and a testing set for classifier testing according to a preset ratio.
(Song, “The experiment is set up as one-vs.-all classification where each user's data was randomly split into training and testing sets, at 80% to 20% ratios”, c12: 45-50; user’s data may be the labeled data for validating a classifier; any ratio between 80%-20% is a preset ratio)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Song into the system or method of Homma in order to validate a classifier after it is trained. The combination of Homma and Song also teaches other enhanced capabilities.
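For illustration only, dividing labeled data into a training set and a testing set according to a preset ratio can be sketched as follows. The function name split_by_ratio, the fixed seed, and the 0.8 default (mirroring the 80% to 20% split quoted from Song) are assumptions made for this sketch; it is not code from Song or from the application.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_ratio(
    labeled: Sequence[T], train_ratio: float = 0.8, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Randomly split labeled data into (training set, testing set)."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must lie strictly between 0 and 1")
    shuffled = list(labeled)
    # A seeded generator keeps the split reproducible (an assumption of this sketch).
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    train, test = split_by_ratio(list(range(10)))
    print(len(train), len(test))
```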

Regarding claims 3, 10 and 17, the combination of Homma and Song teaches its/their respective base claim(s).
The combination further teaches the data processing method according to claim 2, wherein respectively training, for each sample set created from the data set, the classifier comprises:
respectively adding each sample set created from the data set to the training set from the labeled data to form multiple new training sets; and
training multiple classifiers using the multiple new training sets respectively.
(Homma, Figs. 5 and 7, generating/training multiple classifiers from multiple data sets (known and unknown data sets), [0070, 0072-0073])
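For illustration only, the two steps recited for claims 3, 10 and 17 (adding each sample set to the training set to form multiple new training sets, then training one classifier per new training set) can be sketched as below. The nearest-centroid classifier on one-dimensional features is a toy stand-in chosen for brevity, and the names train_centroid_classifier and train_per_sample_set are hypothetical; neither Homma nor the application is represented as using this technique.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]  # (feature value, label)


def train_centroid_classifier(training_set: List[Sample]) -> Callable[[float], int]:
    """Toy stand-in for classifier training: nearest class centroid in 1-D."""
    sums: Dict[int, List[float]] = {}
    for x, y in training_set:
        total = sums.setdefault(y, [0.0, 0.0])
        total[0] += x   # running sum of features for label y
        total[1] += 1   # running count of samples for label y
    centroids = {y: s / n for y, (s, n) in sums.items()}
    return lambda x: min(centroids, key=lambda y: abs(x - centroids[y]))


def train_per_sample_set(
    base_training_set: List[Sample], sample_sets: List[List[Sample]]
) -> List[Callable[[float], int]]:
    """Form one new training set per sample set and train one classifier on each."""
    classifiers = []
    for sample_set in sample_sets:
        new_training_set = base_training_set + sample_set
        classifiers.append(train_centroid_classifier(new_training_set))
    return classifiers


if __name__ == "__main__":
    base = [(0.0, 0), (10.0, 1)]
    # Two sample sets: the same two samples under different label combinations.
    sample_sets = [[(1.0, 0), (9.0, 1)], [(1.0, 1), (9.0, 0)]]
    classifiers = train_per_sample_set(base, sample_sets)
    print(len(classifiers), classifiers[0](2.0), classifiers[0](8.0))
```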

Claim(s) 4, 11 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Homma et al (US2011/0295778) in view of Song et al (US9275345) and further in view of Beers et al (US2015/0206069).

Regarding claims 4, 11 and 18, the combination of Homma and Song teaches its/their respective base claim(s).
The combination does not expressly disclose but Beers teaches the data processing method according to claim 3, wherein obtaining the sample set that corresponds to the trained classifier with the highest performance comprises:
calculating an AUC (Area Under Curve) value of each of the multiple classifiers trained by the multiple new training sets respectively, each AUC value corresponding to a sample set that is created from the data set and is included in one of the multiple new training sets used to train one of the multiple classifiers; and
obtaining a sample set corresponding to the highest AUC value among the plurality of sample sets created from the data set as the sample set whose corresponding classifier has the highest performance.
(Beers, Fig. 8A, “At S4, a heuristic search method, such as ANN, is used to generate a first set of binary classifiers. Iteratively, the ANN model is modified, at S5, by changing a number of hidden layers. This second set of binary classifiers is then compared with the first set with reference to a cost function, such as an area under a curve (AUC) of a ROC at S6”, [0065]; the AUC area may be used to group similar features in a data set of Homma ([0083-0084]))
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Beers into the modified system or method of Homma and Song in order to identify similar features in a data pool using the AUC technique. The combination of Homma, Song and Beers also teaches other enhanced capabilities.
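For illustration only, the AUC-based selection recited in claims 4, 11 and 18 can be sketched as follows. The sketch computes AUC as the fraction of (positive, negative) pairs that a classifier's scores rank correctly, which is the standard pairwise definition of the area under the ROC curve, and then picks the sample set whose classifier scored highest. The function names auc and best_sample_set and the example scores are hypothetical and are not taken from Beers, Homma, or the application.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_sample_set(auc_by_sample_set: Dict[str, float]) -> str:
    """Return the key of the sample set whose classifier has the highest AUC."""
    return max(auc_by_sample_set, key=auc_by_sample_set.get)


if __name__ == "__main__":
    test_labels = [0, 0, 1, 1]
    # Hypothetical test-set scores from two classifiers, one per sample set.
    scores = {"A": [0.1, 0.4, 0.35, 0.8], "B": [0.1, 0.2, 0.7, 0.9]}
    aucs = {name: auc(s, test_labels) for name, s in scores.items()}
    print(aucs, best_sample_set(aucs))
```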


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JIANXUN (JAMES) YANG whose telephone number is (571)272-9874. The examiner can normally be reached on MON-FRI: 8AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at (571)272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/JIANXUN YANG/ Primary Examiner, Art Unit 2664
11/30/2022