DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1 – 9 are pending in this Office correspondence.
Double Patenting
Claims 1 – 9 of this application are patentably indistinct from claims 1 – 7 of Application No. 16/226,394, now U.S. Patent 11,087,179. Pursuant to 37 CFR 1.78(f) or pre-AIA 37 CFR 1.78(b), when two or more applications filed by the same applicant contain patentably indistinct claims, elimination of such claims from all but one application may be required in the absence of good and sufficient reason for their retention during pendency in more than one application. Applicant is required to either cancel the patentably indistinct claims from all but one application or maintain a clear line of demarcation between the applications. See MPEP § 822.
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.  A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

The subject matter claimed in the instant application is fully disclosed in the co-pending application and is covered by the co-pending application since the co-pending application and the application are claiming common subject matter, as follows: 

Instant application 17/396,503 (claims 1 – 9, listed first)
Co-pending Application 16/226,394 (claims 1 – 7, listed second)
1. A system, comprising: 
a trained multi-label support vector machine running a one-vs-the-rest classifier, 
wherein the trained multi-label support vector machine running the one-vs-the-rest classifier is configured with 
trained parameters that are learned from training the trained multi-label support vector machine running the one-vs-the-rest classifier on document features of documents belonging to a plurality of label classes, and 
hyperplane determinations on label classes in the plurality of label classes, and 
wherein the trained parameters include distributions of distances between the label classes and the hyperplanes.

2. The system of claim 1, further configured to apply the trained multi-label support vector machine running the one-vs-the-rest classifier on multi-label classifications of documents.

3. The system of claim 1, wherein the document features represent frequencies or semantics of words in a document.

4. The system of claim 1, wherein the document features include frequency features based on term frequency-inverse document frequency (TF-IDF).

5. The system of claim 1, wherein the document features include semantic features based on embedding in a multi-dimensional vector space using Word2Vec.

6. The system of claim 1, wherein the document features include semantic features based on embedding in a multi-dimensional vector space using global vectors for word representation (GloVe).



7. The system of claim 1, further configured to select hyperparameters of the trained multi-label support vector machine running the one-vs-the-rest classifier across regularization, class weight, and loss function in a predetermined search range such that an at-least-one score is at or within ten percent of maximum attainable over the predetermined search range.

8. The system of claim 7, wherein the at-least-one score calculates a ratio of count of the documents with at least one pairwise match between inferred labels and ground truth labels to a total number of documents with at least one ground truth label.

9. The system of claim 1, wherein one of the label classes is parked domain, for documents posted on parked domains, further configured to: identify parked domains and collect documents posted on the parked domains by crawling uniform resource locators (URLs) that are within a predetermined edit distance of selected URL names, determining for at least some of the crawled URLs that URL resolution is referred to an authoritative nameserver that appears in a list of parked domain nameservers identified as dedicated to parked domains, and collecting the documents posted on the crawled URLs that are referred to the parked domain nameservers; and label the collected documents as collected from the parked domains and store the documents and parked domain labels for use in training.
Claims of co-pending Application 16/226,394:
1. A method of training a multi-label support vector machine (abbreviated SVM) running a one-vs-the-rest (abbreviated OVR) classifier, including: 
accessing training examples for documents belonging to 50 to 250 label classes; 

creating document features representing frequencies or semantics of words in a document; 


training an SVM using the document features for one-vs-the-rest training and hyperplane determinations on the label classes; and 



storing parameters of the trained SVM on the label classes, including distributions of distances between the 50 to 250 label classes and the hyperplanes, for use in production of multi-label classifications of documents.





2. The method of claim 1, wherein the document features include frequency features based on term frequency-inverse document frequency (abbreviated TF-IDF).

3. The method of claim 1, wherein the document features include semantic features based on embedding in a multi-dimensional vector space using Word2Vec.

4. The method of claim 1, wherein the document features include semantic features based on embedding in a multi-dimensional vector space using global vectors for word representation (abbreviated GloVe).

5. The method of claim 1, further including, selecting SVM hyper parameters across regularization, class weight, and loss function in a predetermined search range such that an at-least-one (abbreviated ALO) score is at or within ten percent of maximum attainable over the predetermined search range.



6. The method of claim 5, wherein the ALO score calculates a ratio of count of the documents with at least one pairwise match between inferred labels and ground truth labels to the total number of documents with at least one ground truth label.

7. The method of claim 1, wherein one of the label classes is parked domain, for documents posted on parked domains, further including: identifying parked domains and collecting documents posted on the parked domains, including: crawling uniform resource locators (abbreviated URLs) that are within a predetermined edit distance of selected URL names; determining for at least some of the crawled URLs that URL resolution is referred to an authoritative nameserver that appears in a list of parked domain nameservers identified as dedicated to parked domains; and collecting the documents posted on the crawled URLs that are referred to the parked domain nameservers; and labelling the collected documents as collected from the parked domains and storing the documents and parked domain labels for use in training.









Claims 1 – 9 are rejected under the judicially created doctrine of obviousness-type double patenting as being unpatentable over claims 1 – 7 of co-pending application 16/226,394, now U.S. Patent 11,087,179. Although the conflicting claims are not identical, they are not patentably distinct from each other because claim 1 of the instant application and claim 1 of the co-pending application contain corresponding language that recites virtually all of the same elements and functions, e.g., “the trained multi-label support vector machine running the one-vs-the-rest classifier is configured with trained parameters that are learned from training the trained multi-label support vector machine running the one-vs-the-rest classifier on document features of documents belonging to a plurality of label classes, and hyperplane determinations on label classes in the plurality of label classes.”
The claimed differences would have been obvious to a programmer of ordinary skill because the instant claims are merely broader and/or alternate variations of the claims recited in the co-pending application.
Because the instant claims merely add to or modify the set of elements and functions claimed in the parent application, such modifications would have been readily apparent to a programmer of ordinary skill.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to omit, add, or modify the additional elements of claim 1 of the co-pending application to arrive at claim 1 of the instant application because the person would have realized that the remaining elements would perform the same functions as before.
Such a modification would have been obvious in order to ensure efficient implementation of enterprise policies to filter out inappropriate websites.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1 – 8 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent 7,835,902 issued to Michael Gamon et al. (“Gamon”) in view of U.S. Patent Application Publication 2005/0228783 to James Shanahan et al. (“Shanahan”).
With respect to claim 1, Gamon discloses a system, comprising: 
a trained multi-label support vector machine running a one-vs-the-rest classifier (abstract and column 9, line 64 – column 10, line 3: training-time feature vectors are presented to classifier trainer which may be a training algorithm or a processing component configured to implement such an algorithm. Such a training algorithm can be an algorithm for a naive Bayes classifier, a support vector machine (SVM), or a maximum entropy classifier), 
wherein the trained multi-label support vector machine running the one-vs-the-rest classifier is configured with (abstract and column 9, line 64 – column 10, line 3: training-time feature vectors are presented to classifier trainer which may be a training algorithm or a processing component configured to implement such an algorithm. Such a training algorithm can be an algorithm for a naive Bayes classifier, a support vector machine (SVM), or a maximum entropy classifier)

document features of documents belonging to a plurality of label classes (column 10, lines 4 – 28: a classifier can be a function that maps an input attribute vector, x=(x.sub.1, x.sub.2, x.sub.3, . . . , x.sub.n), to a confidence that the input belongs to a class--that is, f(x)=confidence (class). For example, an SVM classifier can be employed--an SVM generally operates by finding a hyperplane that separates positive examples from negative examples in a multi-dimensional feature space. Other suitable classification approaches include Bayesian networks, neural networks, decision trees and probabilistic classification models providing different patterns of independence. Classification as used herein also is inclusive of statistical regression that is utilized to develop models of priority. The result of the training process is trained classifier. The trained classifier is saved out to a file, which completes the training phase), and 
hyperplane determinations on label classes in the plurality of label classes (column 10, lines 4 – 15 and 19 – 28: a classifier can be a function that maps an input attribute vector, x=(x.sub.1, x.sub.2, x.sub.3, . . . , x.sub.n), to a confidence that the input belongs to a class--that is, f(x)=confidence (class). For example, an SVM classifier can be employed--an SVM generally operates by finding a hyperplane that separates positive examples from negative examples in a multi-dimensional feature space. Other suitable classification approaches include Bayesian networks, neural networks, decision trees and probabilistic classification models providing different patterns of independence. Classification as used herein also is inclusive of statistical regression that is utilized to develop models of priority. At run-time, classifier can assign a score, with the help of document scoring module, to a run-time feature vector (generated from a document to be classified). For example, classifier can provide as an output a statistical probability of document being closer in quality to either the first or last versions of the training documents. Component can translate this probability into a desired score format. This score represents the likelihood of the document being closer to edited or unedited documents as observed at training-time).
With respect to claim 1, Gamon does not explicitly teach trained parameters that are learned from training the trained multi-label support vector machine running the one-vs-the-rest classifier and wherein the trained parameters include distributions of distances between the label classes and the hyperplanes.
Shanahan teaches trained parameters that are learned from training the trained multi-label support vector machine running the one-vs-the-rest classifier and wherein the trained parameters include distributions of distances between the label classes and the hyperplanes (Shanahan: abstract and Para [0007]: The parameters of the support vector machine (SVM) model are determined using a learning algorithm in conjunction with a training data set that characterizes the information need, i.e., a list of documents that have been labeled as positive or negative. Learning a support vector machine can be viewed both as a constraint satisfaction and optimization algorithm, where the first objective is to determine a hyperplane that classifies each labeled training example correctly, and where the second objective is to determine the hyperplane that is furthest from training data. Classifying an example using an SVM model reduces to determining which side of the hyperplane the example falls. If the example falls on the positive side of the hyperplane then the example is assigned a positive label; otherwise it is assigned a negative label. This form of learnt SVM is known as a hard SVM. Other flavors of SVM exist which relax the first objective. For example, not requiring all training examples to be classified correctly by the SVM leads to a flavor known as soft SVMs. In this case the SVM learning algorithm trades-off accuracy of the model with the margin of the model.).
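For illustration only, the hard-SVM decision rule quoted from Shanahan (assign a positive label when the example falls on the positive side of the hyperplane, otherwise a negative label) can be sketched as follows. The sketch assumes a linear model described by a weight vector and a bias term; the function names and numeric values are hypothetical and are not drawn from either reference.

```python
from typing import Sequence


def svm_score(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Signed score f(x) = W . x + b of a linear SVM (hypothetical helper)."""
    if len(weights) != len(x):
        raise ValueError("weight and feature vectors must have equal length")
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Return +1 on the positive side of the hyperplane, otherwise -1."""
    return 1 if svm_score(weights, bias, x) > 0 else -1


# Illustrative values only; not taken from Gamon or Shanahan.
W = [2.0, -1.0]
b = -0.5
print(classify(W, b, [1.0, 0.5]))  # score is 2.0 - 0.5 - 0.5 = 1.0
print(classify(W, b, [0.0, 1.0]))  # score is -1.0 - 0.5 = -1.5
```

In this sketch an example lying exactly on the hyperplane (score of zero) is treated as negative, which is one reading of the "otherwise" branch in the quoted passage.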
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gamon's evaluating a run-time feature vector using a machine-learned classifier to assess editorial quality with the teachings of Shanahan’s adjusting the model threshold of a support vector machine for text classification and filtering, so that a support vector machine is trained using the document features for one-vs-the-rest training and hyperplane determinations on the label classes.
The modification would facilitate applying a scoring function that classifies a document based on the sign of a preset equation to the example documents, thus setting and updating the threshold score of a learnt support vector machine model.
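As a hedged illustration of the claim 1 limitation "distributions of distances between the label classes and the hyperplanes," the sketch below summarizes, for each label class, the signed geometric distances of that class's documents from the class's one-vs-the-rest hyperplane. Representing a distribution by its mean and population standard deviation is an assumption made here for brevity; neither the claims nor the cited references prescribe this form, and all names and values are hypothetical.

```python
import math
from statistics import mean, pstdev


def signed_distance(weights, bias, x):
    """Geometric distance (W . x + b) / ||W|| from a point to a hyperplane."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * xi for w, xi in zip(weights, x)) + bias) / norm


def distance_distributions(hyperplanes, examples):
    """Summarize per-class distances as (mean, population std deviation).

    hyperplanes: {label: (weights, bias)}, one hyperplane per label class.
    examples: list of (feature_vector, set_of_labels) training documents.
    """
    summary = {}
    for label, (weights, bias) in hyperplanes.items():
        dists = [
            signed_distance(weights, bias, x)
            for x, labels in examples
            if label in labels
        ]
        if dists:
            summary[label] = (mean(dists), pstdev(dists))
    return summary


# Toy hyperplanes and documents, chosen so the arithmetic is easy to follow.
hyperplanes = {"news": ([3.0, 4.0], -5.0), "sports": ([0.0, 2.0], -2.0)}
examples = [
    ([3.0, 4.0], {"news"}),
    ([1.0, 3.0], {"news", "sports"}),
    ([0.0, 2.0], {"sports"}),
]
stats = distance_distributions(hyperplanes, examples)
print(stats)
```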

As to claim 2, the system is further configured to apply the trained multi-label support vector machine running the one-vs-the-rest classifier on multi-label classifications of documents (Gamon, column 10, lines 19 – 32: At run-time, classifier can assign a score, with the help of document scoring module, to a run-time feature vector (generated from a document to be classified). For example, classifier can provide as an output a statistical probability of document being closer in quality to either the first or last versions of the training documents. Component can translate this probability into a desired score format. This score represents the likelihood of the document being closer to edited or unedited documents as observed at training-time. The score can be binary (i.e., "needs further work" or "does not need further work") or continuous (i.e., "the document scores 80 out of 100 points for style"). Thus, in addition to a numeric score, other quality/style assessment outputs are possible).
As to claim 3, the document features represent frequencies or semantics of words in a document (Gamon, column 6, lines 47 – 63: Logical Form Type Linguistic Analysis Features: Examples of linguistic analysis type features include features based upon logical forms (LFs). LFs are generated by performing a morphological and syntactic analysis on an input text to produce conventional phrase structure analyses augmented with grammatical relations. Syntactic analyses undergo further processing in order to obtain LFs, which are data structures that describe labeled dependencies among content words in the textual input).
As to claim 4, the document features include frequency features based on term frequency-inverse document frequency (TF-IDF) (Shanahan: Para [0043]: This vector space representation is known as the TF_IDF representation. Similarly, under the vector-space representation, linear SVM information needs can be conceptually viewed as vectors of features, such as words, noun phrases, and other linguistically derived features (e.g., parse tree features)). 
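For illustration only, frequency features of the TF-IDF kind recited in claim 4 can be sketched as below. The particular weighting used here (term count divided by document length, multiplied by the unsmoothed logarithm of the inverse document frequency) is one common variant and is an assumption; neither the claim nor the cited Shanahan paragraph specifies a formula, and the function name is hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for a list of tokenized documents.

    documents: list of token lists.
    Returns one {term: weight} dict per document.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


docs = [["svm", "text", "svm"], ["text", "filter"]]
vectors = tf_idf(docs)
print(vectors)
```

A term that appears in every document ("text" in this toy corpus) receives a weight of zero under this variant, which is why smoothed forms of the inverse document frequency are often preferred in practice.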
As to claim 5, the document features include semantic features based on embedding in a multi-dimensional vector space using Word2Vec (Gamon: column 6, lines 10 – 15:  features extracted by component include Grammar and spelling related features, Word n-grams, Linguistic analysis features based on automatic syntactic and semantic analysis of sentences in a document).
As to claim 6, the document features include semantic features based on embedding in a multi-dimensional vector space using global vectors for word representation (GloVe) (Gamon: column 6, lines 1 – 15: during training, training documents (first drafts and final versions) are input into feature extraction component. Component extracts features from each of training documents and generates a training-time vector for each one of training document, thereby producing a plurality of training-time vectors. It should be noted that each of the plurality of training-time vectors includes a designator of the editorial quality (e.g., first draft, final version, etc.) of the training document to which it corresponds. Features extracted by component include Grammar and spelling related features, Word n-grams, Linguistic analysis features based on automatic syntactic and semantic analysis of sentences in a document).
As to claim 7, select hyperparameters of the trained multi-label support vector machine running the one-vs-the-rest classifier across regularization, class weight, and loss function in a predetermined search range such that an at-least-one score is at or within ten percent of maximum attainable over the predetermined search range (Shanahan: Para [0006]:  An SVM model can be viewed geometrically as a hyperplane (or hypersurface) that partitions two classes of objects in a multi-dimensional feature space into two disjoint subsets; in our case the hyperplane partitions documents into a positive set corresponding to documents that satisfy an information need and into a negative set corresponding to documents that do not satisfy an information need. Mathematically, a linear SVM (non-linear SVMs will be presented subsequently) can be represented in the following two equivalent forms: using a weight vector representation; or using a support vector representation. The weight vector representation mathematically represents an SVM (the separating hyperplane) as a pair of parameters <W, b>, where W denotes a weight vector and b represents a threshold or bias term).
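The selection recited in claim 7 can be sketched, under stated assumptions, as an exhaustive search over regularization, class weight, and loss function that keeps every combination whose score is within ten percent of the best score found. Reading "within ten percent of maximum attainable" as a score of at least 0.9 times the maximum over the search range is an interpretation adopted for this sketch, and the scoring function below is a hypothetical stand-in for training a model and measuring its at-least-one score.

```python
from itertools import product


def select_hyperparameters(grid, score_fn, tolerance=0.10):
    """Return every combination scoring within `tolerance` of the best score.

    grid: {parameter_name: list_of_values}, defining the search range.
    score_fn: callable taking one value per parameter, in grid order.
    """
    scored = [(score_fn(*combo), combo) for combo in product(*grid.values())]
    best = max(score for score, _ in scored)
    return [combo for score, combo in scored if score >= (1.0 - tolerance) * best]


def toy_score(regularization, class_weight, loss):
    """Hypothetical stand-in for an at-least-one score on held-out documents."""
    score = {0.1: 0.70, 1.0: 0.80}[regularization]
    if class_weight == "balanced":
        score += 0.05
    if loss == "squared_hinge":
        score += 0.02
    return score


grid = {
    "regularization": [0.1, 1.0],
    "class_weight": [None, "balanced"],
    "loss": ["hinge", "squared_hinge"],
}
selected = select_hyperparameters(grid, toy_score)
print(selected)
```

With these toy scores the maximum is 0.87, so the cut-off is 0.783 and only the four combinations with regularization 1.0 are retained.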
As to claim 8, the at-least-one score calculates a ratio of count of the documents with at least one pairwise match between inferred labels and ground truth labels to a total number of documents with at least one ground truth label (Shanahan: Para [0006]:  An SVM model can be viewed geometrically as a hyperplane (or hypersurface) that partitions two classes of objects in a multi-dimensional feature space into two disjoint subsets; in our case the hyperplane partitions documents into a positive set corresponding to documents that satisfy an information need and into a negative set corresponding to documents that do not satisfy an information need. Mathematically, a linear SVM (non-linear SVMs will be presented subsequently) can be represented in the following two equivalent forms: using a weight vector representation; or using a support vector representation. The weight vector representation mathematically represents an SVM (the separating hyperplane) as a pair of parameters <W, b>, where W denotes a weight vector and b represents a threshold or bias term. Please also see Para [035]).
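For reference, the ratio recited in claim 8 can be computed as in the following minimal sketch. The function name and the example labels are hypothetical; returning 0.0 when no document has a ground truth label is an assumption, since the claim does not address that case.

```python
def alo_score(inferred, truth):
    """At-least-one score over parallel lists of label sets.

    Numerator: documents whose inferred labels share at least one label
    with their ground truth labels.
    Denominator: documents that have at least one ground truth label.
    """
    eligible = [(pred, gold) for pred, gold in zip(inferred, truth) if gold]
    if not eligible:
        return 0.0  # assumption: no labeled documents yields a score of zero
    hits = sum(1 for pred, gold in eligible if pred & gold)
    return hits / len(eligible)


inferred = [{"news"}, {"sports", "finance"}, {"adult"}, {"news"}]
truth = [{"news", "politics"}, {"finance"}, {"gambling"}, set()]
print(alo_score(inferred, truth))  # two matches out of three labeled documents
```

The fourth document carries no ground truth label, so it is excluded from both the numerator and the denominator.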

Allowable Subject Matter
Claim 9 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  
The prior art of record, U.S. Patent 7,835,902 (Gamon) teaches a machine-learned classifier is trained based on training-time feature vectors that are generated by automatically extracting features from first and last versions of the training documents. A run-time feature vector is generated for textual unit to be assessed by automatically extracting features from the textual unit. A run-time feature vector is evaluated using the classifier to assess the editorial quality.
The prior art of record, US Patent Application Publication 2011/0099500 issued to Weston discloses a simple feature ranking can be produced by evaluating how well an individual feature contributes to the separation (e.g. cancer vs. normal). Various correlation coefficients have been proposed as ranking criteria.  The coefficient used by Golub et al. is defined as: 
$$w_i = \frac{\mu_i(+) - \mu_i(-)}{\sigma_i(+) + \sigma_i(-)} \qquad (2)$$ where $\mu_i$ and $\sigma_i$ are the mean and standard deviation, respectively, of the gene expression values of a particular gene $i$ for all the patients of class (+) or class (-), i=1.
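As a worked illustration of the Golub et al. ranking coefficient quoted from Weston (the difference of the class means of a feature divided by the sum of its class standard deviations), the sketch below computes the coefficient for a single feature. The use of the population standard deviation is an assumption, since the quoted passage does not say which estimator is intended, and the function name and values are hypothetical.

```python
from statistics import mean, pstdev


def golub_coefficient(pos_values, neg_values):
    """Ranking coefficient (mean(+) - mean(-)) / (std(+) + std(-)) for one feature."""
    denom = pstdev(pos_values) + pstdev(neg_values)
    if denom == 0:
        raise ValueError("both classes have zero spread; coefficient is undefined")
    return (mean(pos_values) - mean(neg_values)) / denom


# Toy values of one feature for class (+) and class (-).
pos = [4.0, 6.0]  # mean 5.0, population std deviation 1.0
neg = [1.0, 3.0]  # mean 2.0, population std deviation 1.0
print(golub_coefficient(pos, neg))  # (5.0 - 2.0) / (1.0 + 1.0)
```

A large positive or negative value indicates a feature whose class means are well separated relative to the within-class spread.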
The prior art of record, US Patent Application Publication 2005/0228783 issued to Shanahan, discloses that a model is learnt from example documents marked as positive or negative with respect to a request. Performance of model settings is evaluated on example documents. An adjustment algorithm providing a threshold value is applied. A scoring function that classifies a document based on the sign of a preset equation is applied to documents. Support vector machines (SVMs) can exhibit very conservative, precision-oriented behavior when modeling information needs. This conservative behavior can be overcome by adjusting the position of the hyperplane, the geometric representation of an SVM.
The prior art made of record does not teach or fairly describe that one of the label classes is parked domain, for documents posted on parked domains, with the system further configured to: identify parked domains and collect documents posted on the parked domains by crawling uniform resource locators (URLs) that are within a predetermined edit distance of selected URL names, determining for at least some of the crawled URLs that URL resolution is referred to an authoritative nameserver that appears in a list of parked domain nameservers identified as dedicated to parked domains, and collecting the documents posted on the crawled URLs that are referred to the parked domain nameservers; and label the collected documents as collected from the parked domains and store the documents and parked domain labels for use in training.
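To make the claim 9 limitations concrete, the following sketch filters crawled domain names that are within a predetermined edit distance of selected names and whose authoritative nameserver appears in a list of parked domain nameservers. It assumes Levenshtein distance as the edit distance and assumes that nameserver resolution has already been performed and is supplied as a mapping; the claim specifies neither. All domain names, nameserver names, and function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def parked_candidates(selected, crawled_nameservers, parked_nameservers, max_distance=1):
    """Return crawled domains near a selected name that resolve via a parking nameserver.

    crawled_nameservers: {domain: authoritative nameserver}, assumed pre-resolved.
    parked_nameservers: set of nameservers identified as dedicated to parked domains.
    """
    return sorted(
        domain
        for domain, nameserver in crawled_nameservers.items()
        if nameserver in parked_nameservers
        and any(edit_distance(domain, name) <= max_distance for name in selected)
    )


selected = ["example.com"]
crawled = {
    "examples.com": "ns1.parking.test",   # distance 1, parked nameserver
    "exampel.com": "ns1.parking.test",    # distance 2, parked nameserver
    "exmple.com": "ns.hosting.test",      # distance 1, not a parked nameserver
    "unrelated.org": "ns1.parking.test",  # parked nameserver, but not a near match
}
parked = {"ns1.parking.test"}
print(parked_candidates(selected, crawled, parked, max_distance=1))
print(parked_candidates(selected, crawled, parked, max_distance=2))
```

Documents collected from the returned domains would then be labeled as parked domain and stored for training, which this sketch does not attempt to show.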
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAHID AL ALAM whose telephone number is (571)272-4030.  The examiner can normally be reached on M-F 8:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital, can be reached at 571-272-4215.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

November 5, 2022
/SHAHID A ALAM/Primary Examiner, Art Unit 2162