DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to the communication filed on 1/28/2021. Claims 1-20 are pending in this application.
Priority
This application claims priority to application 16/023,413, filed 6/29/2018. The assignee of record is FORESCOUT TECHNOLOGIES, INC. The listed inventors are: Yang, Siying; Zhang, Yang.
Examiner Note
If applicant has any questions or wishes to amend the claims, applicant is encouraged to contact the examiner to ensure that any proposed amendments would overcome the current rejection(s). The examiner can normally be reached at (571)270-3863 or michael.keller@uspto.gov, Monday-Friday, 9 AM - 10 PM EST, and is happy to assist applicant as needed with any help or feedback.
Terminal Disclaimer
The terminal disclaimer filed on 1/28/2021 has been reviewed and is accepted.  
Double Patenting- WITHDRAWN
The rejection of claims 1-20 on the ground of nonstatutory double patenting as being unpatentable over the claims of U.S. Patent No. 10,812,334 is WITHDRAWN because of the terminal disclaimer filed on 1/28/2021.
Response to Arguments
Applicant’s arguments filed 1/28/2021 have been fully considered but they are not persuasive. Applicant argues:
a. [Applicant's arguments were reproduced here as images: media_image1.png and media_image2.png.]
a. In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Fir ¶ 0105 and Fig. 5 show a feature set 140 and subsets 142 and 144. Subset 142 comprises ASNs with features based on web searches and is used to train web search classification model 79, while subset 144 comprises ASNs with features based on network traffic and is used to train network traffic classification model 77. Fir ¶ 0078-0079 and throughout disclose classification based on the type of data, e.g., network traffic or web searches. The previous action expressly stated: “While Fir appears to clearly teach reliability Fir does not explicitly use the term ‘reliability level.’ However, Yon teaches reliability level (Yon ¶ 0071 reliability level can be updated).” Examiner notes that the classification model used in Fir is based on the type of data. Fir lacks an explicit mention of reliability level, but Yon explicitly shows a reliability level. Yon and Fir are analogous art because they are both related to data collection, and it would have been obvious to one of ordinary skill in the art to use the reliability level techniques of Yon with the system of Fir to increase reliability of data during a collection interval (Yon ¶ 0003).
Response to Amendment
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 3, 5-7, 10, 12, and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Firstenberg et al. (US 20180069884 A1, published 3/8/2018; hereinafter Fir) in view of Yonezawa (US 20130079940 A1, published 3/28/2013; hereinafter Yon).
For Claim 1, Fir teaches a training method (Fir’s Figures describe methods, systems and CRMs, e.g. FIG. 1 is a block diagram that schematically shows a computing facility comprising an anomaly detection system that can identify Internet sites belonging to bulletproof autonomous systems, Fig. 3 is a flow diagram that schematically illustrates a method of creating a model to identify Internet sites belonging to bulletproof autonomous systems, and Fig. 5 is a block diagram that schematically illustrates co-training two classifiers to identify Internet sites belonging to bulletproof autonomous systems.
For reference purposes, screenshots of the relevant figures from Fir were provided at this point (media_image3.png through media_image7.png).
) comprising:
accessing a plurality of device classification methods (Fir Fig. 5, 77 and 79), wherein each of the plurality of methods has a respective associated model (Fir ¶ 0101, two independent learners each having a different view/classification model for the data), and wherein each of the plurality of methods has a respective associated reliability (Fir Fig. 5, “Web Search Classification Model” and “Network Classification Model”; Fir Fig. 5 Subset X1 comprises features based on web searches (Fir ¶ 0105) used to train Web Search Classification Model 79, and a Web Search Classification Model has a higher reliability for features based on web searches than Network Classification Model 77. Examiner also notes Fir ¶ 0099-0102, which recite:
[0099] Co-Training is an extension of bootstrapping (described supra), in which 
there is more than one classifier model used to classify data.  This way the 
classifiers can provide labelled samples to each other, reducing the 
probability of self-bias and errors in labeling. 
 
[0100] Co-training is an algorithm in the framework of semi-supervised 
learning, which begins from a small number of labeled samples and many 
unlabeled samples.  The more labeled data that is available, the better the 
classifiers perform.  The goal of co-training is to increase the number of 
labeled samples. 
 
[0101] Co-training uses two independent learners, each using a different view 
of (i.e., a classification model for) the data.  Examples of classification 
models (also referred to herein as sub-models) include models for network 
traffic, web search results, blacklists, names of the ASNs, threat intelligence 
data and a number of ASNs 94 associated with a given domain 84.  In one 
embodiment, one or more of the sub-models (i.e., learners) can be used to 
identify ASN properties.  In another embodiment, two or more sub-models can be 
co-trained, and the co-trained sub-models can be used to identify ASN 
properties.  The predictions of one learner are used in order to teach the 
other learner in a cyclic manner (i.e., in each round each of the learners 
outputs an updated model).  This way each learner gets a larger number of 
labeled samples that enable it to reach better predictions. 
 
[0102] The goal of the method is to use co-training in order to better identify 
ASN properties.  This can be done by using datasets from different sources 
(e.g., network traffic, threat intelligence sources, web site references to the 
AS) in order to get independent classifiers that can be used in co-training.);
generating a respective data set (Fir Fig. 5 Set X 140) associated with each of the device classification methods based on classifying a plurality of devices communicatively coupled to a network (Fir ¶ 0105 recites:
[0105] Processor 60 uses a set of a feature set 140 that comprises labeled data 
for models 77 and 79.  Feature set 140 comprises a set of ASNs 94 that may be 
labeled (e.g., manually).  Feature set 140 comprises a subset 142 comprising 
ASNs with features based on web searches (i.e., data in records 104), and a 
subset 144 comprising ASNs with features based on network traffic (i.e., data 
in records 80).  Web search classification algorithm accepts subset 142 and a 
set 146 of unlabeled ASNs 94 as inputs, and performs a fitting and 
classification to generate a new labeled data set 148, which processor 60 uses 
to update subset 144.  Network traffic classification algorithm 77 accepts 
subset 142 and a set 150 of unlabeled ASNs 94 as inputs, and performs a fitting 
and classification to generate a new labeled data set 152, which processor 60 
uses to update subset 142, and the training process is repeated until the 
models 77 and 79 are trained, as described in Appendix 3 hereinbelow.);
selecting a first device classification method and a second device classification method (Fir Fig. 5 77 and 79) of the plurality of device classification methods, wherein the first device classification method has a higher reliability (Fir Fig. 5 Subset X1 comprises features based on web searches (Fir ¶ 0105) used to train Web Search Classification Model 79, a Web Search Classification Model has a higher reliability for features based on web searches than Network Classification Model 77);
determining a training data set using a respective data set associated with the first device classification method (Fir Fig. 5 Subset X1 comprises features based on web searches is used to train Web Search Classification Model 79, a Web Search Classification Model has a higher reliability for features based on web searches than Network Classification Model 77);
training, by a processing device, the second device classification method model using the training data set (Fir Fig. 5 the New Labeled Data Set 148 updates Subset X2 resulting in Example Set L 144 which is used to train Network Traffic Classification Model 77); and
storing the trained second device classification model (Fir Fig. 2 memory).
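While Fir appears to clearly teach reliability, Fir does not explicitly use the term “reliability level.”
However, Yon teaches reliability level (Yon ¶ 0071 reliability level can be updated).
Yon and Fir are analogous art because they are both related to data collection, and it would have been obvious to one of ordinary skill in the art to use the reliability level techniques of Yon with the system of Fir to increase reliability of data during a collection interval (Yon ¶ 0003).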
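For illustration only, the co-training loop described in the quoted Fir ¶ 0099-0105 can be sketched as below. This is a minimal sketch under stated assumptions, not Fir's implementation: the names `CentroidLearner` and `co_train`, the one-dimensional features, and the sample values are all hypothetical, and a toy nearest-centroid classifier stands in for models 77 and 79, whose internals are not described in this action. What the sketch keeps from the quoted text is the structure: two learners, each with its own view of the data, each handing its most confident predictions to the other as new labeled samples in each round.

```python
from dataclasses import dataclass, field


@dataclass
class CentroidLearner:
    """Hypothetical 1-D nearest-centroid classifier for one view of the data."""
    centroids: dict = field(default_factory=dict)

    def fit(self, samples):
        # samples: list of (feature_value, label) pairs
        sums = {}
        for x, y in samples:
            total, count = sums.get(y, (0.0, 0))
            sums[y] = (total + x, count + 1)
        self.centroids = {y: t / c for y, (t, c) in sums.items()}

    def predict(self, x):
        # Returns (label, margin); a larger margin means a more confident call.
        ranked = sorted(self.centroids.items(), key=lambda kv: abs(x - kv[1]))
        label, nearest = ranked[0]
        margin = abs(x - ranked[1][1]) - abs(x - nearest) if len(ranked) > 1 else 0.0
        return label, margin


def co_train(view_a, view_b, seed_labels, rounds=4, per_round=1):
    """Co-train two learners; each teaches the other its most confident labels."""
    labeled_a, labeled_b = dict(seed_labels), dict(seed_labels)
    learner_a, learner_b = CentroidLearner(), CentroidLearner()
    for _ in range(rounds):
        learner_a.fit([(view_a[i], y) for i, y in labeled_a.items()])
        learner_b.fit([(view_b[i], y) for i, y in labeled_b.items()])
        for teacher, view, student in ((learner_a, view_a, labeled_b),
                                       (learner_b, view_b, labeled_a)):
            # Samples the student has no label for yet, most confident first.
            pool = [i for i in view if i not in student]
            pool.sort(key=lambda i: teacher.predict(view[i])[1], reverse=True)
            for i in pool[:per_round]:
                student[i] = teacher.predict(view[i])[0]
    # Final fit so both learners reflect every label gathered above.
    learner_a.fit([(view_a[i], y) for i, y in labeled_a.items()])
    learner_b.fit([(view_b[i], y) for i, y in labeled_b.items()])
    return learner_a, learner_b, labeled_a, labeled_b


# Hypothetical data: one web-search feature and one traffic feature per ASN.
web = {"as1": 0.1, "as2": 0.9, "as3": 0.2, "as4": 0.8, "as5": 0.15, "as6": 0.85}
traffic = {"as1": 1.0, "as2": 9.0, "as3": 2.0, "as4": 8.0, "as5": 1.5, "as6": 8.5}
seeds = {"as1": "benign", "as2": "bulletproof"}
learner_a, learner_b, labeled_a, labeled_b = co_train(web, traffic, seeds)
```

Starting from two labeled samples, each learner's labeled set grows by one sample per round, which mirrors the stated goal in Fir ¶ 0100 of increasing the number of labeled samples.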
For Claim 3, Fir-Yon teaches the training method of claim 1, further comprising: performing classification using the second device classification method (Fir Fig. 5 77, 79).
For Claim 5, Fir-Yon teaches the training method of claim 1, wherein each respective model associated with the plurality of device classification methods is a machine learning model (Fir Fig. 5, ¶ 0033 and throughout teaches machine learning).
For Claim 6, Fir-Yon teaches the training method of claim 1, wherein the respective associated reliability level associated with the plurality of device classification methods is configurable (Yon ¶ 0071 reliability level can be updated).
For Claim 7, Fir-Yon teaches the training method of claim 1, wherein the respective associated reliability level associated with a device classification method is automatically adjusted based on one or more classification results based on the device classification method (Yon ¶ 0071-0072 reliability level can be updated by Expression 4 and recalculating of reliability level with data collection cycle being doubled).
For Claim 10, the claim is substantially similar to claim 1 and therefore is rejected for the same reasons set forth above. 
For Claim 12, the claim is substantially similar to claim 3 and therefore is rejected for the same reasons set forth above. 
For Claim 14, the claim is substantially similar to claim 5 and therefore is rejected for the same reasons set forth above. 
For Claim 15, the claim is substantially similar to claim 6 and therefore is rejected for the same reasons set forth above. 

Claims 2, 4, 8-9, 11, 13, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Fir-Yon as applied to claim 1 above, and further in view of Ho et al. (US 20190166024 A1, filed 11/24/2017; hereinafter Ho).
For Claim 2, Fir-Yon teaches the training method of claim 1, further comprising: performing an initial classification of the plurality of devices communicatively coupled to the network (Fir ¶ 0105 discloses an initial classification and ¶ 0101 teaches multiple classification models).
Fir does not explicitly teach determining which of the plurality of device classification methods can be used based on the initial classification of the plurality of devices communicatively coupled to the network.
However, Ho teaches determining which of the plurality of device classification methods can be used based on the initial classification of the plurality of devices communicatively coupled to the network (Ho Fig. 3 and ¶ 0045 teach generating a classification model and a clustering model according to a classification algorithm and a clustering algorithm respectively, then testing the accuracy rate of the classification model and the clustering model with another subset of the principal component data, and, based on whether the accuracy rate reaches a threshold, determining whether to output the classification model and clustering model or to select another subset of the principal component data and return to step S305).
Examiner notes that clarifying what is meant by the determining, or how the determining is performed, may be one way to overcome the current rejection.
A screenshot of Ho Fig. 3 was provided at this point (media_image8.png).
Ho and Fir-Yon are analogous art because they are both related to classification training, and it would have been obvious to one of ordinary skill in the art to combine the teachings of Ho with the system of Fir-Yon to provide a technology which is capable of objectively selecting more important network parameters in a network environment for detecting and analyzing network anomalies (Ho ¶ 0004) and because such a combination would provide more accurate results (Ho ¶ 0046).
For Claim 4, Fir-Yon teaches the training method of claim 1, wherein the training of the second device classification method model using the training data set is performed on a per device basis (Ho Fig. 3 shows a method adapted for an electronic computing apparatus (per device basis). Ho ¶ 0033 recites:
[0033] A second embodiment of the present invention is a network anomaly 
analysis method, and a flowchart diagram thereof is depicted in FIG. 3.  The 
network anomaly analysis method is adapted for an electronic computing 
apparatus (e.g., the network anomaly analysis apparatus 1 of the first 
embodiment).  In this embodiment, the electronic computing apparatus stores a 
plurality of network status data, wherein each of the network status data 
comprises a plurality of network feature values.).
For Claim 8, Fir-Yon teaches the training method of claim 1, wherein the selecting of the first device classification method and the second device classification method of the plurality of device classification methods is based on a network environment (Ho ¶ 0046: the classification and clustering models trained by the network anomaly analysis are suitable for various network environments and provide more accurate network anomaly detection results. Ho ¶ 0045-0046 recite:
[0045] According to the above descriptions, the network anomaly analysis 
technology (including the apparatus, method, and the non-transitory computer 
readable storage medium thereof) provided by the present invention 
dimension-reduces the collected network status data to obtain more 
representative principal component data (i.e., excludes network feature values 
of less importance in the network status data), selects a subset of the 
principal component data as the training data, generates a classification model 
and a clustering model according to a classification algorithm and a clustering 
algorithm respectively, and then tests the accuracy rate of the classification 
model and the clustering model with another subset of the principal component 
data.  If the accuracy rate fails to reach a preset value, the network anomaly 
analysis technology provided by the present invention selects another subset of 
the principal component data to refine the classification model and the 
clustering model, wherein the another subset is selected by taking other 
factors (e.g., the time factor, the regional factor, or the distance to the 
classification model) into consideration. 
[0046] The classification model and the clustering model trained by the network 
anomaly analysis technology according to the present invention are suitable for 
various network environments and, thereby, solves the problem that the network 
parameters need to be determined by professionals and are limited to particular 
network environments in the prior art.  Moreover, the network anomaly analysis 
technology of the present invention eliminates the overfitting problem caused 
by less important network feature values in the training process and, thereby, 
improves the accuracy of the trained classification model and the clustering 
model and provides more accurate network anomaly detection results.).
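For illustration only, the train, test, and refine loop summarized in the quoted Ho ¶ 0045 can be sketched as below. This is a minimal sketch under stated assumptions, not Ho's implementation: the function names, the one-dimensional feature, the toy midpoint-threshold classifier, the batch size, and the 0.9 preset value are all hypothetical. The dimension reduction and the clustering model are omitted, and the extra factors Ho mentions for choosing the next subset (time, region, distance to the classification model) are replaced by a seeded shuffle.

```python
import random


def train_threshold(samples):
    """Toy classifier: midpoint between the class means of a 1-D feature."""
    normal = [x for x, y in samples if y == 0]
    anomalous = [x for x, y in samples if y == 1]
    if not normal or not anomalous:
        return None  # cannot place a threshold without both classes
    return (sum(normal) / len(normal) + sum(anomalous) / len(anomalous)) / 2


def accuracy(threshold, samples):
    """Fraction of samples where 'above threshold' matches the anomaly label."""
    hits = sum((x > threshold) == bool(y) for x, y in samples)
    return hits / len(samples)


def refine_until_accurate(data, preset=0.9, batch=4, max_rounds=5, seed=0):
    """Grow the training subset until held-out accuracy reaches the preset value."""
    rng = random.Random(seed)
    unused = list(range(len(data)))
    rng.shuffle(unused)
    training = []
    round_no = 0
    for round_no in range(1, max_rounds + 1):
        # Select another subset of the data as additional training data.
        training += [data[i] for i in unused[:batch]]
        unused = unused[batch:]
        held_out = [data[i] for i in unused]
        if not held_out:
            break  # nothing left to test against
        threshold = train_threshold(training)
        # Output the model only if accuracy on the other subset is high enough.
        if threshold is not None and accuracy(threshold, held_out) >= preset:
            return threshold, round_no
    return None, round_no


# Hypothetical network status data: (feature value, 1 if anomalous else 0).
status_data = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0), (3.0, 0), (1.2, 0),
               (8.0, 1), (8.5, 1), (9.0, 1), (9.5, 1), (10.0, 1), (8.2, 1)]
model, rounds_used = refine_until_accurate(status_data)
```

The loop returns as soon as the accuracy check passes, so the number of rounds used depends on which samples land in the first training subset.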
For Claim 9, Fir-Yon teaches the training method of claim 1, wherein the first device classification method comprises at least one of an agent based classification method, an aggregator based method, an active probing based method, a passive traffic analysis method, a traffic log analysis method, or a traffic based behavior heuristic method (Ho Fig. 3 ¶ 0036 recites in part: the clustering algorithm adopted in the step S307 may be a K-means algorithm, an agglomerative clustering algorithm or a divisive clustering algorithm, but it is not limited thereto and Ho ¶ 0035 recites in part: the classification algorithm adopted in the step S305 may be a support vector machine, a linear classification algorithm and a K-nearest neighbor algorithm, but it is not limited thereto).
For Claim 11, the claim is substantially similar to claim 2 and therefore is rejected for the same reasons set forth above. 
For Claim 13, the claim is substantially similar to claim 4 and therefore is rejected for the same reasons set forth above. 
For Claim 16, the claim is substantially similar to claim 8 and therefore is rejected for the same reasons set forth above. 
For Claim 17, the claim is substantially similar to claim 9 and therefore is rejected for the same reasons set forth above. 
For Claim 18, the claim is substantially similar to claim 2 and therefore is rejected for the same reasons set forth above. 
For Claim 19, the claim is substantially similar to claim 4 and therefore is rejected for the same reasons set forth above. 
For Claim 20, the claim is substantially similar to claim 8 and therefore is rejected for the same reasons set forth above. 
Citation of Pertinent Prior Art
The prior art made of record and not relied upon that is considered pertinent to applicant's disclosure is listed below:
i. US 20200127892 A1- Abstract- In one embodiment, a device classification service extracts, for each of a plurality of time windows, one or more sets of traffic features of network traffic in a network from traffic telemetry data captured by the network. The service represents, for the time windows, the extracted one or more sets of traffic features as feature vectors. A feature vector for a time window indicates whether each of the traffic features was present in the network traffic during that window. The service trains, using a training dataset based on the feature vectors, a cascade of machine learning classifiers to label devices with device types. The service uses the classifiers to label a particular device in the network with a device type based on the traffic features of network traffic associated with that device. The service initiates enforcement of a network policy regarding the device based on its device type.
ii. US 20200067935 A1- Abstract- A network edge device includes switching circuitry configured to switch traffic from one or more endpoint devices to corresponding application services over a network; and processing circuitry configured to monitor the traffic from the one or more endpoint devices, compare the monitored traffic to classify the one or more endpoint devices into a corresponding trust level of a plurality of trust levels, and route the traffic from each of the one or more endpoint devices based on its corresponding trust level. The network edge element is configured to provide network connectivity to the one or more endpoint devices.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
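As a worked illustration of the timing rule in the preceding paragraph, the following sketch computes when the shortened statutory period expires. It is a minimal sketch under stated assumptions, not legal advice: the function names are hypothetical, the mailing date used is invented because this text does not state one, month arithmetic is assumed to clamp to the last day of a shorter month, and adjustments for weekends, federal holidays, and paid extensions under 37 CFR 1.136(a) are not modeled.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Same day of the month `months` later, clamped to that month's last day."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_period_end(mailed, first_reply=None, advisory_mailed=None):
    """Expiry of the shortened statutory period for reply to a final action."""
    three_months = add_months(mailed, 3)
    six_months = add_months(mailed, 6)
    end = three_months
    # A first reply within two months, answered by an advisory action mailed
    # after the three-month date, moves expiry to the advisory mailing date.
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailed, 2)
            and advisory_mailed > three_months):
        end = advisory_mailed
    # In no event later than six months from the mailing date.
    return min(end, six_months)


# Hypothetical mailing date, for illustration only.
mailed = date(2021, 3, 1)
default_end = reply_period_end(mailed)
advisory_end = reply_period_end(mailed, date(2021, 4, 15), date(2021, 6, 20))
capped_end = reply_period_end(mailed, date(2021, 4, 15), date(2021, 10, 1))
```

With these hypothetical dates, the three calls cover the default three-month case, the case where a late advisory action sets the expiry, and the six-month cap.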
Any inquiry concerning communications from the examiner should be directed to Michael Keller at (571)270-3863 or michael.keller@uspto.gov. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian Gillis, can be reached at 571-272-7952.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL A KELLER/
Primary Patent Examiner, Art Unit 2446