DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

This action is written in response to the arguments filed on November 22, 2021.  Claims 1-20 are currently pending and have been considered below.

Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway, Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder).

As to claim 1, Stojanovic teaches a method comprising: 
obtaining, by a processor, data (see paragraph [0058]: data is ingested during prepare processing; the data (or samples thereof) can be stored in a distributed data storage system 210 (such as a "big data" cluster));
clustering, by the processor, features of the data into a plurality of clusters based on similarities of the data across an entire storage stack … (see paragraphs [0057], technology stack; [0060]-[0063]…Technology stack 200 can be implemented in an environment such as a cluster 210 for big data operations ("Big Data Cluster")…, semantic processing pipeline; [0085]…“information identified based on the data may be compared to known types of data (e.g., business information, personal identification information, or address information) to identify the data that corresponds to a pattern” includes “features of the data”; [0154], K-Means clustering (or other vector analysis) can be used to analyze vectors corresponding to the set of input words, and determine how similar those input words are, based on how "close" the corresponding vectors are within a vector space; [0058], the processing stages described above with respect to FIG. 1 can include a number of processing engines, which Examiner interprets to include the processing rates; [0066]… the analysis engine can query Analysis Config Library for the metrics that should be analyzed across a cluster…. query Resource/Metadata Store to determine the member resources in each cluster…; [0173]…curated data may include curated categories and types in one or more files. The types may include a taxonomy of terms to better identify a category for data 1502; [0227]…The order information received by cloud infrastructure system 2002 in response to the customer placing an order may include information identifying the customer and one or more services offered by the cloud infrastructure system 2002 that the customer intends to subscribe to…; [0234], cloud infrastructure system, information that authenticates the identities of such customers and information that describes which actions those customers are authorized to perform relative to various system resources (e.g., files, directories, applications, communication ports, memory segments, etc.), wherein, using the broadest reasonable interpretation, Examiner interprets “…in response to the customer placing…” to include “the response time,” and the “files and directories” to include “permissions and top users”).
But Stojanovic fails to explicitly teach storage infrastructure metrics including processing input and output (I/O) rates, storage response time and permissions.
However, EATON teaches the features of the data in each cluster comprise storage infrastructure metrics including processing input and output (I/O) rates, storage response time and permissions (see paragraphs [0012]-[0013]…the inferred cluster may include resources that are expected to perform similarly to each other, and the cluster analytics may include detecting outlier resources that do not perform similarly to other resources within the inferred cluster…The metadata may include …IaaS-tenant metadata expressed in arbitrary text specific to the IaaS tenant that the IaaS tenant uses to characterize the resources used in providing the distributed service; [0031]…Provider APIs 305 can return data about the behavior of the virtual infrastructure (e.g., CPU utilization, disk I/O rates); [0037]…the monitoring system can first verify that it has sufficient permissions from the customer and can then call Provider APIs 305 to start the action; [0064]…one web server may deviate from its peers (e.g., have significantly slower response times, or have a much longer backlog of requests); and [0076]… responsiveness of Provider APIs 305 …).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Stojanovic to add storage infrastructure metrics including processing rates and permissions in each cluster, as taught by EATON above.  The modification would have been obvious because one of ordinary skill would be motivated to infer the service architecture having roles and relationships among resources based on the metadata and without human operator modeling input and information regarding actual physical network connectivity between resources, as suggested by EATON ([0052]).
But Stojanovic and EATON fail to explicitly teach:
metadata including owner and access traits; and 
     for each cluster of the plurality of clusters: 
             determining, by the processor, a percentage of the features of the data in the cluster; and 
              performing, by the processor, a random sampling process to randomly sample representative points of features from the cluster, wherein a number of representative points of features randomly sampled from the cluster is proportional to the percentage of the features of the data in the cluster;
       processing, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters, and applying, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the filePage 2 of 38U.S. Patent Application No. 14/943,915Docket No. ARC920150056US1 Amendment dated November 22, 2021Reply to Non-Final Office Action of November 3, 2021content, and the confidentiality label is indicative of whether the file content is confidential; and
           generating, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Conway teaches metadata including owner and access traits (see page 19, second paragraph…system metadata can include characteristics of files and collections, such as ownership, access control relationships, provenance, and collection membership).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of Stojanovic and EATON to add metadata including owner and access traits, as taught by Conway above.  The modification would have been obvious because one of ordinary skill would be motivated to have metadata information that improves the accuracy of infrastructure metrics, as suggested by Conway (see page 19, second paragraph).
But Stojanovic, EATON and Conway fail to explicitly teach:
for each cluster of the plurality of clusters: 
             determining, by the processor, a percentage of the features of the data in the cluster; and 
             performing, by the processor, a random sampling process to randomly sample representative points of features from the cluster, wherein a number of representative points of features randomly sampled from the cluster is proportional to the percentage of the features of the data in the cluster; 
             processing, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters, and applying, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential; and
           generating, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Datta teaches:
for each cluster of the plurality of clusters: 
       determining, by the processor, a percentage of the features of the data in the cluster; and 
     performing, by the processor, a random sampling process to randomly sample representative points of features from the cluster, wherein a number of representative points of features randomly sampled from the cluster is proportional to the percentage of the features of the data in the cluster (see paragraphs [0044]-[0045]…For each motionlet cluster 202 the process randomly samples a smaller set of positive samples (in one example, 5000 images), trains a complementary detector 56, applies the detector tuned to have very few or no false alarms (and all other already trained detectors in the pool) to the set of positive images of the motionlet cluster 202, and selects those that are misclassified for training another complementary detector…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of Stojanovic, EATON and Conway to add a random sampling process for each cluster, as taught by Datta above.  The modification would have been obvious because one of ordinary skill would be motivated to minimize false negatives at the expense of a larger number of false positives, and wherein a collection of weak classification functions is combined to form a stronger classifier having a lowest classification error, as suggested by Datta ([0031]).
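The recited sampling limitation, in which the number of points drawn from each cluster is proportional to that cluster's percentage of the data, can be sketched as follows. This is a minimal illustration only and is not drawn from Datta or from the application; the function name `proportional_sample`, the `total_samples` parameter, the fixed seed, and the rounding rule (at least one point per non-empty cluster) are assumptions made for the example.

```python
import random
from collections import defaultdict


def proportional_sample(cluster_ids, total_samples, seed=0):
    """Randomly sample item indices per cluster, with each cluster's sample
    size proportional to its share of the data (hypothetical helper)."""
    rng = random.Random(seed)

    # Group item indices by the cluster each item was assigned to.
    members = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        members[cluster].append(index)

    n_items = len(cluster_ids)
    sample = {}
    for cluster, indices in members.items():
        # Percentage of the data that falls in this cluster.
        fraction = len(indices) / n_items
        # Assumed rounding rule: nearest integer, at least 1, at most cluster size.
        k = min(len(indices), max(1, round(fraction * total_samples)))
        # Sampling without replacement from the cluster's members.
        sample[cluster] = rng.sample(indices, k)
    return sample
```

Under these assumptions, three clusters holding 60, 30 and 10 items with `total_samples=20` yield 12, 6 and 2 sampled points respectively.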
But Stojanovic, EATON, Conway and Datta fail to explicitly teach:
          processing, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters, and applying, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential; and
           generating, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Furuichi teaches applying, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential (paragraph [0033]… label determination section 214 about the availability of a confidential label of data related to the user operation…; [0036]-[0038]… determining a confidential label, and provides it to the label determination section (214)…confidential label with the highest confidentiality level among the confidential labels obtained by the determinations to be the confidential label of the data…the label determination section (214) determines a confidential label corresponding to the data based on policy information acquired by the policy reference section (215); [0126] … a word list of text (e.g. a dictionary)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of Stojanovic, EATON, Conway and Datta to add a confidentiality label, as taught by Furuichi above.  The modification would have been obvious because one of ordinary skill would be motivated to categorize the content into a predetermined type based on labeling policy information, as suggested by Furuichi ([0036]).
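The recited pattern matching limitation, which finds keywords from a predefined dictionary in the file content to determine a confidentiality label, can be sketched as follows. This is a minimal illustration only and does not represent Furuichi's label determination section; the keyword set `CONFIDENTIAL_KEYWORDS`, the function name `confidentiality_label`, and the whole-word, case-insensitive matching rule are assumptions made for the example.

```python
import re

# Hypothetical predefined dictionary; a real deployment would supply its own.
CONFIDENTIAL_KEYWORDS = frozenset({"confidential", "proprietary", "secret"})


def confidentiality_label(file_content, keywords=CONFIDENTIAL_KEYWORDS):
    """Return True if any dictionary keyword appears as a whole word in the
    file content (case-insensitive), otherwise False."""
    # Tokenize into lowercase words so matching ignores case and punctuation.
    words = set(re.findall(r"[a-z0-9']+", file_content.lower()))
    return not words.isdisjoint(keywords)
```

For example, `confidentiality_label("This memo is CONFIDENTIAL.")` returns `True`, while `confidentiality_label("Lunch menu for Friday")` returns `False`.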
But Stojanovic, EATON, Conway, Datta and Furuichi fail to explicitly teach:
             processing, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters; and
              generating, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Lingenfelder, in combination with Stojanovic, EATON, Conway, Datta and Furuichi, teaches:
              processing, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters (paragraph [0047] …for each cluster of training data in the set, a prediction model is created using only the data points in that cluster…); and
             generating, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features (paragraphs [0044]- [0048] …training data will be referred to herein as training data points. Each training data point is a set of covariates together with a known prediction gathered from historical data (for instance, it is known from past data that a certain amount of water consumption in a building occurred in the past at a certain time/day of the week)….prediction model is created for each cluster of training data points-multiple m clusters are then clustered into "prediction clusters," thus each prediction cluster might have multiple prediction models associated therewith; [0049]… Monte-Carlo sampling….; [0050]…on a prediction, the union of all tags associated to any clusters belonging to a prediction cluster is returned using any ranking scheme…; wherein using the broadest reasonable interpretation, Examiner interprets the training data points and union of all tags to include confidential labels); and 
          training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified (paragraphs [0017]-[0021]…Training data - data on which the model is trained. Training data consists of a set of data points…Data clusters - clusters of the training data. On each cluster a prediction model is trained…final prediction is determined based on majority vote).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of Stojanovic, EATON, Conway, Datta and Furuichi to add a trained prediction model, as taught by Lingenfelder above.  The modification would have been obvious because one of ordinary skill would be motivated to enhance the accuracy of the predictions produced, as suggested by Lingenfelder ([0034]).

As to claim 9, Stojanovic teaches a computer program product for performing efficient data sampling across a storage stack for training machine learning (ML) models, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: 
obtain, by a processor, data (see paragraph [0058]: data is ingested during prepare processing; the data (or samples thereof) can be stored in a distributed data storage system 210 (such as a "big data" cluster));
cluster, by the processor, features of the data into a plurality of clusters based on similarities of the data across an entire storage stack, wherein the features of the data comprise storage infrastructure metrics including processing rates, response time and permissions, file metrics from metadata including owner, top users and user access traits, and application dependency taxonomy including application type (see paragraphs [0057], technology stack; [0060]-[0063]…Technology stack 200 can be implemented in an environment such as a cluster 210 for big data operations ("Big Data Cluster")…, semantic processing pipeline; [0085]…“information identified based on the data may be compared to known types of data (e.g., business information, personal identification information, or address information) to identify the data that corresponds to a pattern” includes “features of the data”; [0154], K-Means clustering (or other vector analysis) can be used to analyze vectors corresponding to the set of input words, and determine how similar those input words are, based on how "close" the corresponding vectors are within a vector space; [0058], the processing stages described above with respect to FIG. 1 can include a number of processing engines, which Examiner interprets to include the processing rates; [0173]…curated data may include curated categories and types in one or more files. The types may include a taxonomy of terms to better identify a category for data 1502; [0227]…The order information received by cloud infrastructure system 2002 in response to the customer placing an order may include information identifying the customer and one or more services offered by the cloud infrastructure system 2002 that the customer intends to subscribe to…; [0234], cloud infrastructure system, information that authenticates the identities of such customers and information that describes which actions those customers are authorized to perform relative to various system resources (e.g., files, directories, applications, communication ports, memory segments, etc.), wherein, using the broadest reasonable interpretation, Examiner interprets “…in response to the customer placing…” to include “the response time,” and the “files and directories” to include “permissions and top users”).
But Stojanovic fails to explicitly teach storage infrastructure metrics including processing input and output (I/O) rates, storage response time and permissions.
However, EATON teaches the features of the data in each cluster comprise storage infrastructure metrics including processing input and output (I/O) rates, storage response time and permissions (see paragraphs [0012]-[0013]…the inferred cluster may include resources that are expected to perform similarly to each other, and the cluster analytics may include detecting outlier resources that do not perform similarly to other resources within the inferred cluster…The metadata may include …IaaS-tenant metadata expressed in arbitrary text specific to the IaaS tenant that the IaaS tenant uses to characterize the resources used in providing the distributed service; [0031]…Provider APIs 305 can return data about the behavior of the virtual infrastructure (e.g., CPU utilization, disk I/O rates); [0037]…the monitoring system can first verify that it has sufficient permissions from the customer and can then call Provider APIs 305 to start the action; [0064]…one web server may deviate from its peers (e.g., have significantly slower response times, or have a much longer backlog of requests); and [0076]… responsiveness of Provider APIs 305 …).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Stojanovic to add storage infrastructure metrics including processing rates and permissions in each cluster, as taught by EATON above.  The modification would have been obvious because one of ordinary skill would be motivated to infer the service architecture having roles and relationships among resources based on the metadata and without human operator modeling input and information regarding actual physical network connectivity between resources, as suggested by EATON ([0052]).
But Stojanovic and EATON fail to explicitly teach:
metadata including owner and access traits; and 
     for each cluster of the plurality of clusters: 
             determine, by the processor, a percentage of the features of the data in the cluster; and 
             perform, by the processor, a random sampling process to randomly sample representative points of features from the cluster, wherein a number of representative points of features randomly sampled from the cluster is proportional to the percentage of the features of the data in the cluster;
             process, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters, and apply, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential; and
           generate, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          train a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Datta teaches:
for each cluster of the plurality of clusters: 
       determine, by the processor, a percentage of the features of the data in the cluster; and 
     perform, by the processor, a random sampling process to randomly sample representative points of features from the cluster, wherein a number of representative points of features randomly sampled from the cluster is proportional to the percentage of the features of the data in the cluster (see paragraphs [0044]-[0045]…For each motionlet cluster 202 the process randomly samples a smaller set of positive samples (in one example, 5000 images), trains a complementary detector 56, applies the detector tuned to have very few or no false alarms (and all other already trained detectors in the pool) to the set of positive images of the motionlet cluster 202, and selects those that are misclassified for training another complementary detector…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of Stojanovic, EATON and Conway to add a random sampling process for each cluster, as taught by Datta above.  The modification would have been obvious because one of ordinary skill would be motivated to minimize false negatives at the expense of a larger number of false positives, and wherein a collection of weak classification functions is combined to form a stronger classifier having a lowest classification error, as suggested by Datta ([0031]).
But Stojanovic, EATON, Conway and Datta fail to explicitly teach:
          process, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters, and apply, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential; and
           generate, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          train a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Furuichi teaches apply, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential (paragraph [0033]… label determination section 214 about the availability of a confidential label of data related to the user operation…; [0036]-[0038]… determining a confidential label, and provides it to the label determination section (214)…confidential label with the highest confidentiality level among the confidential labels obtained by the determinations to be the confidential label of the data…the label determination section (214) determines a confidential label corresponding to the data based on policy information acquired by the policy reference section (215); [0126] … a word list of text (e.g. a dictionary)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of Stojanovic, EATON, Conway and Datta to add a confidentiality label, as taught by Furuichi above.  The modification would have been obvious because one of ordinary skill would be motivated to categorize the content into a predetermined type based on labeling policy information, as suggested by Furuichi ([0036]).
But Stojanovic, EATON, Conway, Datta and Furuichi fail to explicitly teach:
             process, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters; and
              generate, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          train a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Lingenfelder, in combination with Stojanovic, EATON, Conway, Datta and Furuichi, teaches:
              process, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters (paragraph [0047] …for each cluster of training data in the set, a prediction model is created using only the data points in that cluster…); and
             generate, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features (paragraphs [0044]- [0048] …training data will be referred to herein as training data points. Each training data point is a set of covariates together with a known prediction gathered from historical data (for instance, it is known from past data that a certain amount of water consumption in a building occurred in the past at a certain time/day of the week)….prediction model is created for each cluster of training data points-multiple m clusters are then clustered into "prediction clusters," thus each prediction cluster might have multiple prediction models associated therewith; [0049]… Monte-Carlo sampling….; [0050]…on a prediction, the union of all tags associated to any clusters belonging to a prediction cluster is returned using any ranking scheme…; wherein using the broadest reasonable interpretation, Examiner interprets the training data points and union of all tags to include confidential labels); and 
          train a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified (paragraphs [0017]-[0021]…Training data - data on which the model is trained. Training data consists of a set of data points…Data clusters - clusters of the training data. On each cluster a prediction model is trained…final prediction is determined based on majority vote).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta and Furuichi to add a trained prediction model to the combination system of Stojanovic, EATON, Conway, Datta and Furuichi as taught by Lingenfelder above.  The modification would have been obvious because one of ordinary skill would be motivated to enhance the accuracy of the predictions produced, as suggested by Lingenfelder ([0034]).

As to claim 16, Stojanovic teaches an apparatus comprising: 
a storage device configured to receive data (see paragraph [0058], data is ingested during prepare processing, the data (or samples thereof) can be stored in a distributed data storage system 210 (such as a "big data" cluster)); 
a clustering processor configured to cluster features of the data into a plurality of clusters based on similarities of the data across an entire storage stack comprising storage infrastructure metrics, file metrics and application dependency taxonomy including application type (see paragraphs [0057], technology stack, [0060]-[0063]…Technology stack 200 can be implemented in an environment such as a cluster 210 for big data operations ("Big Data Cluster")…, semantic processing pipeline; [0154], K-Means clustering (or other vector analysis) can be used to analyze vectors corresponding to the set of input words, and determine how similar those input words are, based on how "close" the corresponding vectors are within a vector space; [0058], wherein Examiner interprets the processing stages described above with respect to FIG. 1, can include a number of processing engines to include the processing rates; [0173]…curated data may include curated categories and types in one or more files.  The types may include a taxonomy of terms to better identify a category for data 1502; [0234], cloud infrastructure system, information that authenticates the identities of such customers and information that describes which actions those customers are authorized to perform relative to various system resources (e.g., files, directories, applications, communication ports, memory segments, etc.)).
But Stojanovic fails to explicitly teach storage infrastructure metrics including processing input and output (I/O) rates, storage response time and permissions.
However, EATON teaches the features of the data in each cluster comprising: storage infrastructure metrics including processing input and output (I/O) rates, storage response time and permissions (see paragraphs [0012]-[0013]…the inferred cluster may include resources that are expected to perform similarly to each other, and the cluster analytics may include detecting outlier resources that do not perform similarly to other resources within the inferred cluster…The metadata may include …IaaS-tenant metadata expressed in arbitrary text specific to the IaaS tenant that the IaaS tenant uses to characterize the resources used in providing the distributed service; [0031]…Provider APIs 305 can return data about the behavior of the virtual infrastructure (e.g., CPU utilization, disk I/O rates); [0037]…the monitoring system can first verify that it has sufficient permissions from the customer and can then call Provider APIs 305 to start the action; [0064]…one web server may deviate from its peers (e.g., have significantly slower response times, or have a much longer backlog of requests); and [0076]… responsiveness of Provider APIs 305 …).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the system of Stojanovic to add storage infrastructure metrics including processing rates and permissions in each cluster to Stojanovic’s system as taught by EATON above.  The modification would have been obvious because one of ordinary skill would be motivated to infer the service architecture having roles and relationships among resources based on the metadata and without human operator modeling input and information regarding actual physical network connectivity between resources, as suggested by EATON ([0052]).
But Stojanovic and EATON fail to explicitly teach:
metadata including owner and access traits; and 
     for each cluster of the plurality of clusters: 
             determine, by the processor, a percentage of the features of the data in the cluster; and 
              perform, by the processor, a random sampling process to randomly sample representative points of features from the cluster, wherein a number of representative points of features randomly sampled from the cluster is proportional to the percentage of the features of the data in the cluster.
However, Conway teaches metadata including owner and access traits (see page 19, second paragraph…system metadata can include characteristics of files and collections, such as ownership, access control relationships, provenance, and collection membership).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic and EATON to add metadata including owner and access traits to the combination system of Stojanovic and EATON as taught by Conway above.  The modification would have been obvious because one of ordinary skill would be motivated to have metadata information that improves the accuracy of infrastructure metrics, as suggested by Conway (see page 19, second paragraph).
But Stojanovic, EATON and Conway fail to explicitly teach:
for each cluster of the plurality of clusters: 
             determine, by the processor, a percentage of the features of the data in the cluster; and 
              perform, by the processor, a random sampling process to randomly sample representative points of features from the cluster, wherein a number of representative points of features randomly sampled from the cluster is proportional to the percentage of the features of the data in the cluster;
            process, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters, and apply, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential; and
           generate, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          train a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
Furuichi teaches apply, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential (paragraph [0033]… label determination section 214 about the availability of a confidential label of data related to the user operation…; [0036]-[0038]… determining a confidential label, and provides it to the label determination section (214)…confidential label with the highest confidentiality level among the confidential labels obtained by the determinations to be the confidential label of the data…the label determination section (214) determines a confidential label corresponding to the data based on policy information acquired by the policy reference section (215); [0126] … a word list of text (e.g. a dictionary)).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway and Datta to add a confidentiality label to the combination system of Stojanovic, EATON, Conway and Datta as taught by Furuichi above.  The modification would have been obvious because one of ordinary skill would be motivated to categorize the content into a predetermined type based on labeling policy information, as suggested by Furuichi ([0036]).
But Stojanovic, EATON, Conway, Datta and Furuichi fail to explicitly teach:
             process, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters; and
              generate, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features, and 
          train a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.
However, Lingenfelder, in combination with Stojanovic, EATON, Conway, Datta and Furuichi, teaches:
              process, by the processor, file content of each representative point of features randomly sampled from each cluster of the plurality of clusters (paragraph [0047] …for each cluster of training data in the set, a prediction model is created using only the data points in that cluster…); and
             generate, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features (paragraphs [0044]- [0048] …training data will be referred to herein as training data points. Each training data point is a set of covariates together with a known prediction gathered from historical data (for instance, it is known from past data that a certain amount of water consumption in a building occurred in the past at a certain time/day of the week)….prediction model is created for each cluster of training data points-multiple m clusters are then clustered into "prediction clusters," thus each prediction cluster might have multiple prediction models associated therewith; [0049]… Monte-Carlo sampling….; [0050]…on a prediction, the union of all tags associated to any clusters belonging to a prediction cluster is returned using any ranking scheme…; wherein using the broadest reasonable interpretation, Examiner interprets the training data points and union of all tags to include confidential labels); and 
          train a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified (paragraphs [0017]-[0021]…Training data - data on which the model is trained. Training data consists of a set of data points…Data clusters - clusters of the training data. On each cluster a prediction model is trained…final prediction is determined based on majority vote).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta and Furuichi to add a trained prediction model to the combination system of Stojanovic, EATON, Conway, Datta and Furuichi as taught by Lingenfelder above.  The modification would have been obvious because one of ordinary skill would be motivated to enhance the accuracy of the predictions produced, as suggested by Lingenfelder ([0034]).


Claims 2 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over 
Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,”  hereinafter referred to as Conway), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall).

As to claim 2, which incorporates the rejection of claim 1, Stojanovic, EATON, Conway, and Datta fail to explicitly teach:
progressively sampling the features of the data in each cluster of the plurality of clusters by incrementing a sampling size in each cluster of the plurality of clusters.
However, Hall teaches progressively sampling the features of the data in each cluster of the plurality of clusters by incrementing a sampling size in each cluster of the plurality of clusters (see Fig. 10, element 1026, paragraphs [0073]…incrementing from the minimum to the maximum number of clusters or vice versa…; [0094]-[0095], wherein Examiner interprets a next number of clusters is defined by incrementing or decrementing a counter of the number of clusters from the minimum number of clusters or the maximum number of clusters, respectively to teach the limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder to increment a cluster sampling size s to the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder, as taught by Hall above.  The modification would have been obvious because one of ordinary skill would be motivated to enable users to concurrently manage data, transform variables, perform exploratory analysis, and build and compare models, as suggested by Hall ([0157]).

As to claim 10, which incorporates the rejection of claim 9, Stojanovic, EATON, Conway, Datta, Furuichi and Lingenfelder fail to explicitly teach program instructions executable by the processor to cause the processor to:
progressively sampling the features of the data in each cluster of the plurality of clusters by incrementing a sampling size in each cluster of the plurality of clusters,
wherein the combined representative points of features is a union of representative points of features selected from each cluster of the plurality of clusters.
However, Hall teaches program instructions executable by the processor to cause the processor to:
progressively sample the features of the data in each cluster of the plurality of clusters by incrementing a sampling size in each cluster of the plurality of clusters (see Fig. 10, element 1026, paragraphs [0073]…incrementing from the minimum to the maximum number of clusters or vice versa…; [0094]-[0095], wherein Examiner interprets a next number of clusters is defined by incrementing or decrementing a counter of the number of clusters from the minimum number of clusters or the maximum number of clusters, respectively to teach the limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder to increment a cluster sampling size s to the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder, as taught by Hall above.  The modification would have been obvious because one of ordinary skill would be motivated to enable users to concurrently manage data, transform variables, perform exploratory analysis, and build and compare models, as suggested by Hall ([0157]).

Claims 3, 11, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,”  hereinafter referred to as Conway), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall), and Figueroa et al. (“Predicting sample size required for classification performance,” hereinafter referred to as Figueroa), and Kripalani et al. (US 2013/0332685 A1, hereinafter referred to as Kripalani).

As to claim 3, which incorporates the rejection of claim 2, Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder and Hall fail to explicitly teach wherein the progressively sampling continues until a prediction accuracy threshold is met by training the prediction model using the ML training data or until a sampling memory usage threshold has been met, wherein the storage infrastructure metrics further include access frequency, and wherein the file metrics further include role based access control.
Figueroa teaches wherein progressively sampling continues until a prediction accuracy threshold is met by training a prediction model using the sampled features or until a sampling memory usage threshold has been met (see page 2 of 10, right column, the classification performance increases rapidly with an increase in the size of the training set; the second section is characterized by a turning point where the increase in performance is less rapid and a final section where the classifier has reached its efficiency threshold, i.e. no (or only marginal) improvement in performance is observed with increasing training set size). 
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder and Hall, to add a sampling prediction accuracy threshold to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder and Hall, as taught by Figueroa above.  The modification would have been obvious because one of ordinary skill would be motivated to have a simple and effective sample size prediction algorithm that conducts weighted fitting of learning curves, as suggested by Figueroa (Abstract).
But Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall and Figueroa fail to explicitly teach the storage infrastructure metrics further include access frequency, and the file metrics further include role based access control.
However, Kripalani teaches wherein the storage infrastructure metrics further include access frequency, and wherein the file metrics further include role based access control (see paragraph [0073] …frequency of change (e.g., a period in which the data object is modified…file location within a file folder directory structure, user permissions, owners, groups, access control lists [ACLs]), system metadata (e.g., registry information)).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, and Figueroa to add access frequency and role based access control to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, and Figueroa, as taught by Kripalani above.  The modification would have been obvious because one of ordinary skill would be motivated to improve user access to data files across multiple computing devices and/or hosted services, as suggested by Kripalani ([0082]).

As to claim 11, which incorporates the rejection of claim 10, Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder and Hall fail to explicitly teach wherein the progressively sampling continues until a prediction accuracy threshold is met by training the prediction model using the training data or until a sampling memory usage threshold has been met, wherein the storage infrastructure metrics further include access frequency, and wherein the file metrics further include role based access control.
Figueroa teaches wherein progressively sampling continues until a prediction accuracy threshold is met by training a prediction model using the sampled features or until a sampling memory usage threshold has been met (see page 2 of 10, right column, the classification performance increases rapidly with an increase in the size of the training set; the second section is characterized by a turning point where the increase in performance is less rapid and a final section where the classifier has reached its efficiency threshold, i.e. no (or only marginal) improvement in performance is observed with increasing training set size). 
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder and Hall to add a sampling prediction accuracy threshold to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder and Hall, as taught by Figueroa above.  The modification would have been obvious because one of ordinary skill would be motivated to have a simple and effective sample size prediction algorithm that conducts weighted fitting of learning curves, as suggested by Figueroa (Abstract).
But Stojanovic, EATON, Conway, Datta, Furuichi, Hall, Lingenfelder and Figueroa fail to explicitly teach the storage infrastructure metrics further include access frequency, and the file metrics further include role based access control.
However, Kripalani teaches wherein the storage infrastructure metrics further include access frequency, and wherein the file metrics further include role based access control (see paragraph [0073] …frequency of change (e.g., a period in which the data object is modified…file location within a file folder directory structure, user permissions, owners, groups, access control lists [ACLs]), system metadata (e.g., registry information)).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Hall, Lingenfelder and Figueroa to add access frequency and role based access control to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Hall, Lingenfelder and Figueroa, as taught by Kripalani above.  The modification would have been obvious because one of ordinary skill would be motivated to improve user access to data files across multiple computing devices and/or hosted services, as suggested by Kripalani ([0082]).

As to claim 17, which incorporates the rejection of claim 16, Stojanovic, Conway, Datta, Furuichi, Hall and Lingenfelder fail to explicitly teach wherein the progressively sampling continues until a prediction accuracy threshold is met by training the prediction model using the training data or until a sampling memory usage threshold has been met, wherein the storage infrastructure metrics further include access frequency, and wherein the file metrics further include role based access control.
Figueroa teaches wherein progressively sampling continues until a prediction accuracy threshold is met by training the prediction model using the sampled features or until a sampling memory usage threshold has been met (see page 2 of 10, right column, the classification performance increases rapidly with an increase in the size of the training set; the second section is characterized by a turning point where the increase in performance is less rapid and a final section where the classifier has reached its efficiency threshold, i.e. no (or only marginal) improvement in performance is observed with increasing training set size). 
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, Hall and Lingenfelder to add a sampling prediction accuracy threshold to the combination system of Stojanovic, Conway, Datta, Furuichi, Hall and Lingenfelder, as taught by Figueroa above.  The modification would have been obvious because one of ordinary skill would be motivated to have a simple and effective sample size prediction algorithm that conducts weighted fitting of learning curves, as suggested by Figueroa (Abstract).
But Stojanovic, EATON, Conway, Datta, Furuichi, Hall, Lingenfelder and Figueroa fail to explicitly teach the storage infrastructure metrics further include access frequency, and the file metrics further include role based access control.
However, Kripalani teaches wherein the storage infrastructure metrics further include access frequency, and wherein the file metrics further include role based access control (see paragraph [0073] …frequency of change (e.g., a period in which the data object is modified…file location within a file folder directory structure, user permissions, owners, groups, access control lists [ACLs]), system metadata (e.g., registry information)).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Hall, Lingenfelder and Figueroa to add access frequency and role based access control to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Hall, Lingenfelder and Figueroa, as taught by Kripalani above.  The modification would have been obvious because one of ordinary skill would be motivated to improve user access to data files across multiple computing devices and/or hosted services, as suggested by Kripalani ([0082]).

Claims 4-5 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,”  hereinafter referred to as Conway), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and  Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall), and Figueroa et al. (“Predicting sample size required for classification performance,” hereinafter referred to as Figueroa), and Kripalani et al. (US 2013/0332685 A1, hereinafter referred to as Kripalani), and Tung et al. (US 2010/0125473 A1, hereinafter referred to as Tung).

As to claim 4, which incorporates the rejection of claim 3, Stojanovic teaches wherein the prediction model predicts one of the following categories for the one or more types of data: classified, unclassified, private, or public (see paragraphs [0021] …complete classification of data by apply unsupervised machine learning techniques in combination with merging multiple sources for supervised machine learning…; [0023] ….combine unsupervised learning techniques with supervised learning techniques to more accurately label categories of input data…; [0068]…the data sources can include a public cloud storage service 311, a private cloud storage service 313, various other cloud services 315, a URL or web-based data source 317, or any other accessible data source…). 
But Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani fail to explicitly teach:
 the prediction model is used to perform a cloud-readiness recommendation for moving the one or more types of data offsite to cloud-based storage.
However, Tung teaches:
  the prediction model is used to perform a cloud-readiness recommendation for moving the one or more types of data offsite to cloud-based storage (see Figs. 14 and 14A,  element 1450;  paragraph [0079], cloud computing opportunity score may indicate whether a cloud computing service exists capable of hosting the computing component, whereas the cloud computing readiness score may indicate whether the computing component is ready for a transition to cloud computing; [0080], a total cloud computing opportunity score and a total cloud computing readiness score; [0086]-[0090]; [0011]-[0014], recommendations field 1455). 
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani to add a cloud-readiness recommendation to the combination system of Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani as taught by Tung above.  The modification would have been obvious because one of ordinary skill would be motivated to generate a strategy for transitioning the computing component to the cloud computing environment, thus reducing the energy consumption of a data center, and hence performing cost savings, time-to-market, adaptability and providing improved computing capabilities, as suggested by Tung ([0043]).

As to claim 5, Stojanovic teaches wherein the prediction model predicts one of the following categories for the one or more types of data: classified, unclassified, private, or public (paragraphs [0021]… complete classification of data by apply unsupervised machine learning techniques in combination with merging multiple sources for supervised machine learning…; [0023] ….combine unsupervised learning techniques with supervised learning techniques to more accurately label categories of input data…; [0068]…the data sources can include a public cloud storage service 311, a private cloud storage service 313, various other cloud services 315, a URL or web-based data source 317, or any other accessible data source…). 

As to claim 12, which incorporates the rejection of claim 11, Stojanovic teaches wherein the prediction model predicts one of the following categories for the one or more types of data: classified, unclassified, private, or public (see paragraphs [0021] …complete classification of data by apply unsupervised machine learning techniques in combination with merging multiple sources for supervised machine learning…; [0023] ….combine unsupervised learning techniques with supervised learning techniques to more accurately label categories of input data…; [0068]…the data sources can include a public cloud storage service 311, a private cloud storage service 313, various other cloud services 315, a URL or web-based data source 317, or any other accessible data source…). 
But Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani fail to explicitly teach:
 the prediction model is used to perform a cloud-readiness recommendation for moving the one or more types of data offsite to cloud-based storage.
However, Tung teaches:
  the prediction model is used to perform a cloud-readiness recommendation for moving the one or more types of data offsite to cloud-based storage (see Figs. 14 and 14A,  element 1450;  paragraph [0079], cloud computing opportunity score may indicate whether a cloud computing service exists capable of hosting the computing component, whereas the cloud computing readiness score may indicate whether the computing component is ready for a transition to cloud computing; [0080], a total cloud computing opportunity score and a total cloud computing readiness score; [0086]-[0090]; [0011]-[0014], recommendations field 1455). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani to add a cloud-readiness recommendation to the combination system of Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani, as taught by Tung above.  The modification would have been obvious because one of ordinary skill would be motivated to generate a strategy for transitioning the computing component to the cloud computing environment, thus reducing the energy consumption of a data center and hence improving cost savings, time-to-market, and adaptability and providing improved computing capabilities, as suggested by Tung ([0043]).

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,” hereinafter referred to as Conway), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Muffat et al. (US 2020/0250241 A1, hereinafter referred to as Muffat), and Liu et al. (US 2002/0188579 A1, hereinafter referred to as Liu), and Yitshak (US 2014/0214407 A1, hereinafter referred to as Yitshak).

As to claim 6, which incorporates the rejection of claim 1, Conway teaches wherein the one or more types of data include at least one of business data or social networking data (page 250… 4.1 Types of Cited Social Media…social media sources were those related to social networking sites (such as Facebook, Twitter and Mendeley) and multimedia sharing communities (like YouTube and Flickr), with Twitter and Academic.edu being the least cited social media (citations to these two types occurred only once)).
However, Stojanovic, Conway, Datta, Furuichi, and Lingenfelder fail to explicitly teach wherein instead of entire file content of each file in the entire storage stack, the storage infrastructure metrics, the file metrics, and the application dependency taxonomy are used with a representative example of the file content that includes partial content of the file content for reducing sampling processing time and memory requirements.
Muffat teaches a representative example of file content that includes partial content of the file content (see paragraphs [0007]-[0008] ... clustering module formulates representative subsets of the sampled documents ... ; [0021] ... representative sampling; [0025] ... metadata features 105 in the extracted metadata is utilized to cluster the documents by weighted clustering 106. The weighted clustering 106 of the documents is determined in accordance ...; [0034]-[0035] and [0056] ... Files from each cluster are sampled equally 424 and content clustering is applied 426 on the sampled files).
Liu teaches wherein instead of entire file content of each file in the entire storage stack,  the storage infrastructure metrics, file metrics and application dependency taxonomy for each file are used with a representative example of the file content for reducing sampling processing time and memory requirements (see paragraph [0092], wherein Examiner interprets “the random subsample” as a representative example, and “the random smaller sample provides an appropriate sampling of the data, while also reducing the processing time” to teach the limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder to add the sampling processing time reduction to the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder, as taught by Liu above.  The modification would have been obvious because one of ordinary skill would be motivated to have a random sub-sample that reduces the computational requirements, as suggested by Liu ([0011]).

As to claim 13, which incorporates the rejection of claim 9, Conway teaches wherein the one or more types of data include at least one of business data or social networking data (page 250… 4.1 Types of Cited Social Media…social media sources were those related to social networking sites (such as Facebook, Twitter and Mendeley) and multimedia sharing communities (like YouTube and Flickr), with Twitter and Academic.edu being the least cited social media (citations to these two types occurred only once)).
However, Stojanovic, Conway, Datta, Furuichi, and Lingenfelder fail to explicitly teach wherein instead of entire file content of each file in the entire storage stack, the storage infrastructure metrics, the file metrics and the application dependency taxonomy are used with a representative example of the file content that includes partial content of the file content for reducing sampling processing time and memory requirements.
Muffat teaches a representative example of file content that includes partial content of the file content (see paragraphs [0007]-[0008] ... clustering module formulates representative subsets of the sampled documents ... ; [0021] ... representative sampling; [0025] ... metadata features 105 in the extracted metadata is utilized to cluster the documents by weighted clustering 106. The weighted clustering 106 of the documents is determined in accordance ...; [0034]-[0035] and [0056] ... Files from each cluster are sampled equally 424 and content clustering is applied 426 on the sampled files).
Liu teaches wherein instead of entire file content of each file in the entire storage stack,  the storage infrastructure metrics, the file metrics and the application dependency taxonomy for each file are used with a representative example of the file content for reducing sampling processing time and memory requirements (see paragraph [0092], wherein Examiner interprets “the random subsample” as a representative example, and “the random smaller sample provides an appropriate sampling of the data, while also reducing the processing time” to teach the limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder to add the sampling processing time reduction to the combination system of Stojanovic, Conway, Datta, Furuichi, and Lingenfelder, as taught by Liu above.  The modification would have been obvious because one of ordinary skill would be motivated to have a random sub-sample that reduces the computational requirements, as suggested by Liu ([0011]).

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,” hereinafter referred to as Conway), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall), and Caraviello et al. (US 2010/0332430 A1, hereinafter referred to as Caraviello), and Perng et al. (US 2008/0126556 A1, hereinafter referred to as Perng).

As to claim 7, which incorporates the rejection of claim 2, Stojanovic, Eaton, Conway, Datta, Furuichi, Hall and Lingenfelder fail to explicitly teach:
sampling the plurality of clusters with a first sampling percentage;
applying a previous clustering-based sampling to obtain an ML training data set, and combining the ML training data set with any previous ML training data set;
training a ML model and obtaining a classification accuracy for the ML model on a held-out test data set or using k-fold cross validation on the ML training data set, wherein a training data structure for the ML training data set includes a data identifier, the features of the data based on the data identifier and assigned class labels;
and comparing the classification accuracy with an accuracy from a previous sampling of the data.
Caraviello teaches wherein progressively sampling the plurality of clusters comprises:   sampling the plurality of clusters with a first sampling percentage (see paragraph [0235], wherein Examiner interprets splitting the data into k parts to include a first sampling percentage); 
applying a previous clustering-based sampling to obtain a ML training data set (see paragraph [0252], the data set is divided into k subsets, and the holdout method is repeated k times, wherein Examiner interprets repeating k times to teach the limitation), and combining the ML training data set with any previous training data (see paragraph [0235], wherein Examiner interprets the cumulative cross-validation algorithm starts with an empty data set and adds record by record, updating the state of the network after each additional record to teach the limitation); 
training a ML model and obtaining a classification accuracy for the ML model on a held-out test data set or using k-fold cross validation on the ML training data set (see paragraphs [0249]-[0254], a k-fold cross-validation method is an improvement over the holdout method.  The data set is divided into k subsets, and the holdout method is repeated k times); and 
comparing the classification accuracy with an accuracy from a previous sampling of the data (see paragraph [0219], the idea here is to select the subset of features that will have the best classification performance when used for building a model with a specific algorithm. Accuracy is evaluated through cross-validation, holdout set, or bootstrap estimator. A model and a set of cross-validation folds must be performed for each subset of features being evaluated). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder and Hall to add a k-fold cross-validation to the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder and Hall, as taught by Caraviello above.  The modification would have been obvious because one of ordinary skill would be motivated to reduce dimensionality by replacing original features with a combination of one or more of the features included in one or more of the association rules, and to allow mining of discriminative and essential frequent patterns via model-based search tree, as suggested by Caraviello ([0068]-[0069]).
Perng teaches wherein a training data structure for the ML training data set includes a data identifier, the features of the data based on the data identifier and assigned class labels (see paragraph [0039]…training data 304 to assign a class label. A class label is a label on a data item to indicate which class the data belongs to…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall and Caraviello to add a data identifier and assigned class labels to the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall and Caraviello, as taught by Perng above.  The modification would have been obvious because one of ordinary skill would be motivated to have data streams effectively classified and processed using existing models in the form of classifiers to improve efficiency of training and classification despite continuously changing data and concept drifts, as suggested by Perng ([0078]).

As to claim 14, which incorporates the rejection of claim 10, Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder and Hall fail to explicitly teach:
sample the plurality of clusters with a first sampling percentage;
apply a previous clustering-based sampling to obtain an ML training data set, and combining the ML training data set with any previously determined ML training data;
train, by the processor, a ML model and obtaining a classification accuracy for the ML model on a held-out test data set or using k-fold cross validation on the obtained ML training data set, wherein a training data structure for the ML training data set includes a data identifier, the features of the data based on the data identifier and assigned class labels;
and compare the classification accuracy with an accuracy from a previous sampling of the data.
Caraviello teaches wherein progressively sampling the plurality of clusters comprises:   
sampling the plurality of clusters with a first sampling percentage (see paragraph [0235], wherein Examiner interprets splitting the data into k parts to include a first sampling percentage); 
apply a previous clustering-based sampling to obtain an ML training data set (see paragraph [0252], the data set is divided into k subsets, and the holdout method is repeated k times wherein Examiner interprets repeating k times to teach the limitation), and combining the ML training data set with any previous training data set (see paragraph [0235], wherein Examiner interprets the cumulative cross-validation algorithm starts with an empty data set and adds record by record, updating the state of the network after each additional record to teach the limitation); 
train, by the processor, a ML model and obtaining a classification accuracy for the ML model on a held-out test data set or using k-fold cross validation on the obtained ML training data set (see paragraphs [0249]-[0254], a k-fold cross-validation method is an improvement over the holdout method.  The data set is divided into k subsets, and the holdout method is repeated k times); and 
compare the classification accuracy with an accuracy from a previous sampling of the data (see paragraph [0219], the idea here is to select the subset of features that will have the best classification performance when used for building a model with a specific algorithm. Accuracy is evaluated through cross-validation, holdout set, or bootstrap estimator. A model and a set of cross-validation folds must be performed for each subset of features being evaluated). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder and Hall to add a k-fold cross-validation to the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder and Hall, as taught by Caraviello above.  The modification would have been obvious because one of ordinary skill would be motivated to reduce dimensionality by replacing original features with a combination of one or more of the features included in one or more of the association rules, and to allow mining of discriminative and essential frequent patterns via model-based search tree, as suggested by Caraviello ([0068]-[0069]).
Perng teaches wherein a training data structure for the ML training data set includes a data identifier, the features of the data based on the data identifier and assigned class labels (see paragraph [0039] …training data 304 to assign a class label. A class label is a label on a data item to indicate which class the data belongs to…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall and Caraviello to add a data identifier and assigned class labels to the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall and Caraviello, as taught by Perng above.  The modification would have been obvious because one of ordinary skill would be motivated to have data streams effectively classified and processed using existing models in the form of classifiers to improve efficiency of training and classification despite continuously changing data and concept drifts, as suggested by Perng ([0078]).

Claims 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,” hereinafter referred to as Conway), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall), and Caraviello et al. (US 2010/0332430 A1, hereinafter referred to as Caraviello), and Perng et al. (US 2008/0126556 A1, hereinafter referred to as Perng), and Wang et al. (US 9,053,391 B2, hereinafter referred to as Wang).

As to claim 8, Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Caraviello, and Perng fail to explicitly teach wherein the progressively sampling further comprises:
upon a determination that the classification accuracy improves over the accuracy from the previous sampling of the data, perform incremental sampling to a second sampling percentage; and
upon a determination that the classification accuracy converges or does not improve over the accuracy from the previous sampling of the data, or a total sampling size is larger than a predetermined sampling size threshold, outputting the sampled data and the trained ML model from a previous progressive sampling iteration.
However, Wang teaches wherein progressively sampling the plurality of clusters further comprises: 
upon a determination that the classification accuracy improves over the accuracy from the previous sampling of the data, perform incremental sampling to a second sampling percentage (see col. 1, lines 26-67, boosting….at each iteration, a weak classifier is added to form a final strong classifier. The weak classifiers are typically weighted by their accuracy.  After a weak classifier is added, the samples are reweighted: the weights of the misclassified samples will be increased, and the samples that are classified correctly will have decreased weights. Weak classifiers that are subsequently added will be trained based on the re-weighted samples, focusing more on the misclassified samples; col. 2, lines 5-24…when the tree is built, it may be pruned using cross-validation procedure); and    
upon a determination that the classification accuracy converges or does not improve over the accuracy from the previous sampling of the data, or a total sampling size is larger than a predetermined sampling size threshold, outputting the sampled data and the trained ML model from a previous progressive sampling iteration (see col. 2, lines 25-32…if the value of the variable is less the threshold. This pair is called split. Once a leaf node is reached, the value assigned to this node is used as the output of prediction procedure.  Examiner interprets “the variable” as the sampling size; col. 6, lines 4-65… If the error is below some threshold, or doesn't change a lot compared with previous iteration, exit the algorithm, return the up-to-date cluster centers; col. 7, lines 49-60…co-training procedure may continue, until some convergence criteria is met), wherein the trained prediction model comprises the trained ML model (col. 1, lines 50-67…, wherein using the broadest reasonable interpretation, Examiner interprets “…a weak classifier is trained with respect to the samples and the associated weights.  At each iteration, a weak classifier is added to form a final strong classifier” … col. 3, lines 60-67 to col. 4, lines 1-2 … “The model trained with the initial training samples may be updated with the newly added samples, so that the updated…” to teach the limitation). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Caraviello, and Perng to add incremental sampling to the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Caraviello, and Perng, as taught by Wang above.  The modification would have been obvious because one of ordinary skill would be motivated to combine the expanding-models technique and/or the technique of updating the initial model to achieve improved performance, as suggested by Wang (lines 50-54).
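For illustration only, and without characterizing the disclosure of Wang or of the present application, the progressive sampling limitations recited above can be sketched in Python as follows. The names `progressive_sampling`, `train_and_score`, `max_sample_size`, and `tolerance` are hypothetical and are introduced solely for this sketch; the convergence test (an accuracy gain no larger than `tolerance`) is likewise an assumption, since the claim does not specify how convergence is measured.

```python
from typing import Callable, Optional, Sequence, Tuple

# (sampling percentage, trained model, classification accuracy)
Result = Tuple[float, object, float]


def progressive_sampling(
    percentages: Sequence[float],
    train_and_score: Callable[[float], Tuple[object, float]],
    total_items: int,
    max_sample_size: int,
    tolerance: float = 1e-3,
) -> Optional[Result]:
    """Return the result of the last accepted sampling iteration.

    train_and_score(pct) is a hypothetical callback that samples the
    clusters at percentage pct, trains an ML model, and returns
    (model, accuracy).
    """
    previous: Optional[Result] = None
    for pct in percentages:
        # Stop if the total sampling size exceeds the predetermined threshold.
        if round(total_items * pct) > max_sample_size:
            break
        model, accuracy = train_and_score(pct)
        # Stop if accuracy converged or did not improve over the previous
        # sampling; the previous iteration's output is kept.
        if previous is not None and accuracy - previous[2] <= tolerance:
            break
        # Accuracy improved: accept this iteration and move on to the
        # next (larger) sampling percentage.
        previous = (pct, model, accuracy)
    return previous
```

In this sketch the function returns `None` when even the first sampling percentage would exceed the size threshold; otherwise it returns the percentage, model, and accuracy from the iteration preceding the one at which a stopping condition was met.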

As to claim 15, Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Caraviello, and Perng fail to explicitly teach wherein the progressively sampling further comprises:
upon a determination that the classification accuracy improves over the accuracy from the previous sampling of the data, perform incremental sampling to a second sampling percentage; and
upon a determination that the classification accuracy converges or does not improve over the accuracy from the previous sampling of the data, or a total sampling size is larger than a predetermined sampling size threshold, outputting the sampled data and the trained ML model from a previous progressive sampling iteration.
However, Wang teaches wherein progressively sampling the plurality of clusters further comprises: 
upon a determination that the classification accuracy improves over the accuracy from the previous sampling of the data, perform incremental sampling to a second sampling percentage (see col. 1, lines 26-67, boosting….at each iteration, a weak classifier is added to form a final strong classifier. The weak classifiers are typically weighted by their accuracy.  After a weak classifier is added, the samples are reweighted: the weights of the misclassified samples will be increased, and the samples that are classified correctly will have decreased weights. Weak classifiers that are subsequently added will be trained based on the re-weighted samples, focusing more on the misclassified samples; col. 2, lines 5-24…when the tree is built, it may be pruned using cross-validation procedure); and    
upon a determination that the classification accuracy converges or does not improve over the accuracy from the previous sampling of the data, or a total sampling size is larger than a predetermined sampling size threshold, outputting the sampled data and the trained ML model from a previous progressive sampling iteration (see col. 2, lines 25-32…if the value of the variable is less the threshold. This pair is called split. Once a leaf node is reached, the value assigned to this node is used as the output of prediction procedure.  Examiner interprets “the variable” as the sampling size; col. 6, lines 4-65… If the error is below some threshold, or doesn't change a lot compared with previous iteration, exit the algorithm, return the up-to-date cluster centers; col. 7, lines 49-60…co-training procedure may continue, until some convergence criteria is met),
wherein the trained prediction model comprises the trained ML model (col. 1, lines 50-67…, wherein using the broadest reasonable interpretation, Examiner interprets “…a weak classifier is trained with respect to the samples and the associated weights.  At each iteration, a weak classifier is added to form a final strong classifier” … col. 3, lines 60-67 to col. 4, lines 1-2 … “The model trained with the initial training samples may be updated with the newly added samples, so that the updated…” to teach the limitation). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Caraviello, and Perng to add incremental sampling to the combination system of Stojanovic, Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Caraviello, and Perng, as taught by Wang above.  The modification would have been obvious because one of ordinary skill would be motivated to combine the expanding-models technique and/or the technique of updating the initial model to achieve improved performance, as suggested by Wang (lines 50-54).

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,”  hereinafter referred to as Conway), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall), and Muffat et al. (US 2020/0250241 A1, hereinafter referred to as Muffat), and Tung et al. (US 2010/0125473 A1, hereinafter referred to as Tung), and Liu et al. (US 2002/0188579 A1, hereinafter referred to as Liu), and Yitshak (US 2014/0214407 A1, hereinafter referred to as Yitshak).

As to claim 18, Stojanovic teaches wherein the prediction model predicts one of the following categories for the one or more types of data: classified, unclassified, private, or public (see paragraphs [0021] …complete classification of data by apply unsupervised machine learning techniques in combination with merging multiple sources for supervised machine learning…; [0023] ….combine unsupervised learning techniques with supervised learning techniques to more accurately label categories of input data…; [0068]…the data sources can include a public cloud storage service 311, a private cloud storage service 313, various other cloud services 315, a URL or web-based data source 317, or any other accessible data source…). 
But Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani fail to explicitly teach:
 the prediction model is used to perform a cloud-readiness recommendation for moving the one or more types of data offsite to cloud-based storage.
However, Tung teaches:
  the prediction model is used to perform a cloud-readiness recommendation for moving the one or more types of data offsite to cloud-based storage (see Figs. 14 and 14A,  element 1450;  paragraph [0079], cloud computing opportunity score may indicate whether a cloud computing service exists capable of hosting the computing component, whereas the cloud computing readiness score may indicate whether the computing component is ready for a transition to cloud computing; [0080], a total cloud computing opportunity score and a total cloud computing readiness score; [0086]-[0090]; [0011]-[0014], recommendations field 1455). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani to add a cloud-readiness recommendation to the combination system of Stojanovic, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani, as taught by Tung above.  The modification would have been obvious because one of ordinary skill would be motivated to generate a strategy for transitioning the computing component to the cloud computing environment, thus reducing the energy consumption of a data center and hence improving cost savings, time-to-market, and adaptability and providing improved computing capabilities, as suggested by Tung ([0043]). 
Muffat teaches a representative example of the file content that includes partial content of the file content (see paragraphs [0007]-[0008] ... clustering module formulates representative subsets of the sampled documents ... ; [0021] ... representative sampling; [0025] ... metadata features 105 in the extracted metadata is utilized to cluster the documents by weighted clustering 106. The weighted clustering 106 of the documents is determined in accordance ...; [0034]-[0035] and [0056] ... Files from each cluster are sampled equally 424 and content clustering is applied 426 on the sampled files).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani to add partial content of the file content to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, and Kripalani, as taught by Muffat above. The modification would have been obvious because one of ordinary skill would be motivated to reduce the cost of computation resources, as suggested by Muffat ([0047]).
But Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat and Tung fail to explicitly teach:
instead of entire file content of each file in the entire storage stack, the storage infrastructure metrics, the file metrics and the application dependency taxonomy for each file are used with a representative example of the file content for reducing sampling processing time and memory requirements.
However, Liu teaches wherein instead of entire file content of each file in the entire storage stack,  the storage infrastructure metrics, the file metrics and the application dependency taxonomy for each file are used with a representative example of the file content for reducing sampling processing time and memory requirements (see paragraph [0092], wherein Examiner interprets “the random subsample” as a representative example, and “the random smaller sample provides an appropriate sampling of the data, while also reducing the processing time” to teach the limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat and Tung to add the sampling processing time reduction to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat and Tung, as taught by Liu above.  The modification would have been obvious because one of ordinary skill would be motivated to have a random sub-sample that reduces the computational requirements, as suggested by Liu ([0011]).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,” hereinafter referred to as Conway), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall), and Figueroa et al. (“Predicting sample size required for classification performance,” hereinafter referred to as Figueroa), and Kripalani et al. (US 2013/0332685 A1, hereinafter referred to as Kripalani), and Muffat et al. (US 2020/0250241 A1, hereinafter referred to as Muffat), and Tung et al. (US 2010/0125473 A1, hereinafter referred to as Tung), and Liu et al. (US 2002/0188579 A1, hereinafter referred to as Liu), and Yitshak (US 2014/0214407 A1, hereinafter referred to as Yitshak), and Caraviello et al. (US 2010/0332430 A1, hereinafter referred to as Caraviello), and Perng et al. (US 2008/0126556 A1, hereinafter referred to as Perng).

As to claim 19, Caraviello teaches wherein: 
the sampling processor is further configured to: 
sample the plurality of clusters with a first sampling percentage (see paragraph [0235], wherein Examiner interprets splitting the data into k parts to include a first sampling percentage); 
apply a previous clustering-based sampling to obtain a ML training data set, and combining the training data set with any previous ML training data (see paragraph [0252], the data set is divided into k subsets, and the holdout method is repeated k times wherein Examiner interprets repeating k times to teach the limitation), and 
the ML processor is further configured to: 
train an ML model and obtain a classification accuracy for the ML model on a held-out test data set or using k-fold cross validation on the obtained training data set (see paragraphs [0249]-[0254], a k-fold cross-validation method is an improvement over the holdout method.  The data set is divided into k subsets, and the holdout method is repeated k times); and 
comparing the classification accuracy with an accuracy from a previous sampling of the data (see paragraph [0219], the idea here is to select the subset of features that will have the best classification performance when used for building a model with a specific algorithm. Accuracy is evaluated through cross-validation, holdout set, or bootstrap estimator. A model and a set of cross-validation folds must be performed for each subset of features being evaluated). 
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat, Tung and Liu to add a k-fold cross-validation to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat, Tung and Liu, as taught by Caraviello above.  The modification would have been obvious because one of ordinary skill would be motivated to reduce dimensionality by replacing original features with a combination of one or more of the features included in one or more of the association rules, and to allow mining of discriminative and essential frequent patterns via a model-based search tree, as suggested by Caraviello ([0068]-[0069]).
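For illustration only, the k-fold cross-validation procedure cited from Caraviello (the data set is divided into k subsets, and the holdout method is repeated k times) may be sketched as follows. The function names, the seed parameter, and the majority-label stand-in for a trained model are assumptions made for this sketch and do not appear in Caraviello or in the claims.

```python
import random
from statistics import mean


def k_fold_indices(n_items, k, seed=0):
    """Shuffle the item indices and split them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_label(labels):
    """Hypothetical stand-in for a trained model: predict the most common label."""
    return max(sorted(set(labels)), key=labels.count)


def cross_validated_accuracy(labels, k=5, seed=0):
    """Repeat the holdout method k times and average the per-fold accuracy."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        # Train on every item that is not in the held-out fold.
        train = [labels[i] for i in range(len(labels)) if i not in held]
        prediction = majority_label(train)
        # Score the stand-in model on the held-out fold only.
        correct = sum(1 for i in held_out if labels[i] == prediction)
        scores.append(correct / len(held_out))
    return mean(scores)
```

Each item is held out exactly once, so the averaged score uses every item for testing while never testing on an item that was used for training in the same repetition. The sketch assumes the number of items is at least k, so that no fold is empty.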
Perng teaches wherein a training data structure for the ML training data set includes a data identifier, the features of the data based on the data identifier and assigned class labels (see paragraph [0039] …training data 304 to assign a class label. A class label is a label on a data item to indicate which class the data belongs to…).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat, Tung, Liu and Caraviello to add data identifier and assigned class labels to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat, Tung, Liu and Caraviello, as taught by Perng above.  The modification would have been obvious because one of ordinary skill would be motivated to have data streams effectively classified and processed using existing models in the form of classifiers to improve efficiency of training and classification despite continuously changing data and concept drifts, as suggested by Perng ([0078]).
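For illustration only, a training data structure of the kind recited in claim 19 (a data identifier, the features of the data based on the data identifier, and an assigned class label) may be sketched as follows. The class name and field names are hypothetical and are taken from neither Perng nor the application.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainingRecord:
    """One row of ML training data (hypothetical field names)."""
    data_id: str                 # identifier of the underlying data item
    features: Tuple[float, ...]  # feature values derived for that data item
    class_label: str             # label indicating which class the item belongs to


# Example row: a document identified as "doc-1" with two feature values.
example = TrainingRecord(data_id="doc-1", features=(0.1, 0.2), class_label="confidential")
```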

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Stojanovic et al. (US 2016/0092557 A1, hereinafter referred to as Stojanovic), in view of EATON et al. (US 2015/0082432 A1, hereinafter referred to as EATON), and further in view of Conway et al. (“Metadata and Semantics Research: Advancing the DFC Semantic Technology Platform via HIVE Innovation,” hereinafter referred to as Conway), and Datta et al. (US 2014/0003708 A1, hereinafter referred to as Datta), and Furuichi et al. (US 2012/0166442 A1, hereinafter referred to as Furuichi), and Lingenfelder et al. (US 2014/0180992 A1, hereinafter referred to as Lingenfelder), and Hall et al. (US 2016/0048577 A1, hereinafter referred to as Hall), and Figueroa et al. (“Predicting sample size required for classification performance,” hereinafter referred to as Figueroa), and Kripalani et al. (US 2013/0332685 A1, hereinafter referred to as Kripalani), and Muffat et al. (US 2020/0250241 A1, hereinafter referred to as Muffat), and Tung et al. (US 2010/0125473 A1, hereinafter referred to as Tung), and Liu et al. (US 2002/0188579 A1, hereinafter referred to as Liu), and Yitshak (US 2014/0214407 A1, hereinafter referred to as Yitshak), and Caraviello et al. (US 2010/0332430 A1, hereinafter referred to as Caraviello), and Perng et al. (US 2008/0126556 A1, hereinafter referred to as Perng), and Wang et al. (US 9,053,391 B2, hereinafter referred to as Wang).

As to claim 20, Wang teaches wherein the ML processor is further configured to:   
upon a determination that the classification accuracy improves over the accuracy from the previous sampling of the data, perform incremental sampling to a second sampling percentage (see col. 1, lines 26-67, boosting….at each iteration, a weak classifier is added to form a final strong classifier. The weak classifiers are typically weighted by their accuracy.  After a weak classifier is added, the samples are reweighted: the weights of the misclassified samples will be increased, and the samples that are classified correctly will have decreased weights. Weak classifiers that are subsequently added will be trained based on the re-weighted samples, focusing more on the misclassified samples; col. 2, lines 5-24…when the tree is built, it may be pruned using cross-validation procedure); and 
upon a determination that the classification accuracy converges or does not improve over the accuracy from the previous sampling of the data, or a total sampling size is larger than a predetermined sampling size threshold, output the sampled data and the trained ML model from a previous progressive sampling iteration (see col. 2, lines 25-32…if the value of the variable is less the threshold. This pair is called split. Once a leaf node is reached, the value assigned to this node is used as the output of prediction procedure.  Examiner interprets “the variable” as the sampling size; col. 6, lines 4-65… If the error is below some threshold, or doesn't change a lot compared with previous iteration, exit the algorithm, return the up-to-date cluster centers; col. 7, lines 49-60…co-training procedure may continue, until some convergence criteria is met), wherein the trained prediction model comprises the trained ML model (col. 1, lines 50-67…, wherein using the broadest reasonable interpretation, Examiner interprets “…a weak classifier is trained with respect to the samples and the associated weights.  At each iteration, a weak classifier is added to form a final strong classifier” … col. 3, lines 60-67 to col. 4, lines 1-2…The model trained with the initial training samples may be updated with the newly added samples, so that the updated” to teach the limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing of the claimed invention to modify the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat, Tung, Liu and Caraviello to add incremental sampling to the combination system of Stojanovic, EATON, Conway, Datta, Furuichi, Lingenfelder, Hall, Figueroa, Kripalani, Muffat, Tung, Liu and Caraviello, as taught by Wang above.  The modification would have been obvious because one of ordinary skill would be motivated to combine the expanding-models technique and/or the updating-the-initial-model technique to achieve improved performance, as suggested by Wang (lines 50-54).
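For illustration only, the stopping logic recited in claim 20 (continue to a larger sampling percentage while accuracy improves; otherwise, or once the sampling size threshold would be exceeded, output the result of the previous iteration) may be sketched as follows. The `evaluate` callback, the percentage parameters, and the tolerance are assumptions made for this sketch; it is not Wang's boosting or co-training procedure.

```python
def progressive_sampling(evaluate, first_pct=10, step_pct=10, max_pct=50, tolerance=0.0):
    """Grow the sample until accuracy stops improving or the size cap is reached.

    evaluate(pct) is a hypothetical callback that trains a model on a pct%
    sample and returns its classification accuracy.
    Returns the (sampling percentage, accuracy) of the last accepted iteration.
    """
    best_pct = first_pct
    best_acc = evaluate(first_pct)
    pct = first_pct + step_pct
    # Stop before the total sampling size exceeds the predetermined threshold.
    while pct <= max_pct:
        acc = evaluate(pct)
        if acc - best_acc <= tolerance:
            # Accuracy converged or did not improve: keep the previous iteration.
            break
        best_pct, best_acc = pct, acc
        pct += step_pct
    return best_pct, best_acc
```

With accuracies of 0.70, 0.80, and 0.80 at 10%, 20%, and 30%, the loop stops at the 30% sample and returns the 20% result, because the larger sample did not improve on it.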

Response to Applicant’s arguments
Applicant's arguments on file on 11/22/2021 with respect to claims 1-20 have been considered and are not persuasive.

Rejections under 35 U.S.C. § 103
Claims 1, 9 and 16
Argument 1
Applicant appears to assert that Stojanovic, Eaton, Conway, Datta, Furuichi, and Lingenfelder, whether considered separately or in combination, fails to teach or suggest "applying, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential;
generating, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features; and training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified" (emphasis added) as per independent claim 1 and similarly per independent claims 9 and 16.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, and Lingenfelder does not teach or suggest all the limitations of Applicant's independent claims 1, 9, and 16, Applicant's independent claims 1, 9, and 16 are not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, and Lingenfelder.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claims 1, 9, and 16 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Furuichi teaches:
applying, by the processor, pattern matching to the file content to determine a confidentiality label corresponding to the representative point of features, wherein the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential (paragraphs [0033] and [0036]-[0038]… [0126] … a word list of text (e.g. a dictionary)).
Lingenfelder teaches:
generating, by the processor, machine learning (ML) training data, wherein the training data comprises a union of each sampled representative point of features randomly sampled from each cluster of the plurality of clusters and a confidentiality label corresponding to the representative point of features (paragraphs [0044]-[0048] …training data will be referred to herein as training data points. Each training data point is a set of covariates together with a known prediction gathered from historical data (for instance, it is known from past data that a certain amount of water consumption in a building occurred in the past at a certain time/day of the week)….prediction model is created for each cluster of training data points-multiple m clusters are then clustered into "prediction clusters," thus each prediction cluster might have multiple prediction models associated therewith; [0049]… Monte-Carlo sampling….; [0050]…on a prediction, the union of all tags associated to any clusters belonging to a prediction cluster is returned using any ranking scheme…; wherein using the broadest reasonable interpretation, Examiner interprets the training data points and union of all tags to include confidential labels); and 
training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified (paragraphs [0017]-[0021]…Training data - data on which the model is trained. Training data consists of a set of data points…Data clusters - clusters of the training data. On each cluster a prediction model is trained…final prediction is determined based on majority vote).
Therefore, Furuichi and Lingenfelder teach the limitations above.  Stojanovic, Eaton, Conway, Datta, Furuichi, and Lingenfelder teach all the limitations of claims 1, 9, and 16. Accordingly, the 35 U.S.C. § 103 rejections for claims 1, 9 and 16 are respectfully maintained.
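For illustration only, the disputed limitations (finding keywords from a predefined dictionary in file content to assign a confidentiality label, and forming training data as the union of points randomly sampled from each cluster together with their labels) may be sketched as follows. The keyword set, function names, label strings, and the per-cluster sample size are hypothetical; the sketch is not taken from Furuichi, Lingenfelder, or the application.

```python
import random
import re

# Hypothetical predefined dictionary of keywords that indicate confidentiality.
CONFIDENTIAL_KEYWORDS = {"confidential", "secret", "proprietary"}


def confidentiality_label(text, keywords=CONFIDENTIAL_KEYWORDS):
    """Label the text 'confidential' if any dictionary keyword occurs in it."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return "confidential" if words & keywords else "non-confidential"


def build_training_data(clusters, per_cluster=1, seed=0):
    """Randomly sample documents from every cluster and attach a label to each.

    clusters maps a cluster id to a list of (doc_id, text) pairs.
    The result is the union, over all clusters, of (doc_id, cluster_id, label).
    """
    rng = random.Random(seed)
    training = []
    for cluster_id in sorted(clusters):
        docs = clusters[cluster_id]
        # Never request more samples than the cluster contains.
        for doc_id, text in rng.sample(docs, min(per_cluster, len(docs))):
            training.append((doc_id, cluster_id, confidentiality_label(text)))
    return training
```

Sampling from every cluster keeps each group of similar documents represented in the training data, while the dictionary lookup supplies labels without manual annotation.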

Argument 2
Claims 2 and 10
Applicant appears to assert that neither Hall nor Kazama, however, discloses: (1) applying pattern matching to file content of a representative point of features to determine a confidentiality label corresponding to the representative point of features, where the pattern matching comprises finding keywords selected from a predefined dictionary in the file content, and the confidentiality label is indicative of whether the file content is confidential, (2) generating machine learning (ML) training data, where the training data comprises a union of each representative point of features randomly sampled from each cluster of a plurality of clusters and a confidentiality label corresponding to the representative point of features, and (3) training a prediction model based on the ML training data, wherein the trained prediction model predicts whether one or more types of data are classified.

Examiner response:
Examiner respectfully disagrees.  Furuichi and Lingenfelder do teach the claimed limitations, as explained above in the response to Argument 1.
Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, and Kazama does teach or suggest all the limitations of Applicant's independent claims 1 and 9.
Applicant's independent claims 1 and 9 are obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, and Kazama.
Additionally, the claims that depend on independent claims 1 and 9, namely claim 2, and claim 10, respectively, are not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, and Kazama for at least the same reasons.
Accordingly, the 35 U.S.C. §103 rejection of claims 2 and 10 is respectfully maintained.
  
Argument 3
Claims 3, 11 and 17
Claim 3 depends on independent claim 1. Claim 11 depends on independent claim 9. Claim 17 depends on independent claim 16. As asserted above, Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, and Kazama fails to teach or suggest all the claimed limitations of independent claims 1, 9, and 16.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, and Kripalani does not teach or suggest all the limitations of Applicant's independent claims 1, 9, and 16, Applicant's independent claims 1, 9, and 16 are not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, and Kripalani. Additionally, the claims that depend on independent claims 1 and 9, namely claim 2, and claim 10, respectively, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, and Kripalani for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claims 3, 11, and 17 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, and Kripalani does teach or suggest all the limitations of Applicant's independent claims 1, 9, and 16.
Applicant's independent claims 1, 9, and 16 are obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, and Kripalani. Additionally, the claims that depend on independent claims 1, 9, and 16, namely claims 3, 11, and 17, respectively, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, and Kripalani for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejections for claims 3, 11, and 17 are respectfully maintained.

Argument 4
Claims 4, 5 and 12
The rejection of claims 4-5 and 12 under 35 U.S.C. §103 as allegedly being unpatentable over Stojanovic in view of Eaton, Conway, Datta, Hall, Kazama, Figueroa, Kripalani, and Tung is respectfully traversed because for at least the following reasons, Stojanovic, Eaton, Conway, Datta, Hall, Kazama, Figueroa, Kripalani, and Tung, whether considered separately or in combination, fails to teach or suggest all of the claimed limitations.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, and Tung does not teach or suggest all the limitations of Applicant's independent claims 1 and 9, Applicant's independent claims 1 and 9 are not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, and Tung. Additionally, the claims that depend on independent claims 1 and 9, namely claims 4-5, and claim 12, respectively, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, and Tung for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claims 4-5 and 12 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, and Tung does teach or suggest all the limitations of Applicant's independent claims 1 and 9. Applicant's independent claims 1 and 9 are obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, and Tung. Additionally, the claims that depend on independent claims 1 and 9, namely claims 4-5, and claim 12, respectively, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, and Tung for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejection of claims 4-5 and 12 is respectfully maintained.

Argument 5
Claims 6 and 13
Claim 6 depends on independent claim 1. Claim 13 depends on independent claim 9. As asserted above, Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Muffat, and Liu fails to teach or suggest all the claimed limitations of independent claims 1 and 9.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Muffat, and Liu does not teach or suggest all the limitations of Applicant's independent claims 1 and 9, Applicant's independent claims 1 and 9 are not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Muffat, and Liu. Additionally, the claims that depend on independent claims 1 and 9, namely claim 6, and claim 13, respectively, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Muffat, and Liu for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claims 6 and 13 is respectfully requested.
Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Muffat, and Liu does teach or suggest all the limitations of Applicant's independent claims 1 and 9. Applicant's independent claims 1 and 9 are obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Muffat, and Liu. Additionally, the claims that depend on independent claims 1 and 9, namely claim 6, and claim 13, respectively, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Muffat, and Liu for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejection of claims 6 and 13 is respectfully maintained.

Argument 6
Claims 7 and 14
Claim 7 depends on independent claim 1. Claim 14 depends on independent claim 9. As asserted above, Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, and Hall fails to teach or suggest all the claimed limitations of independent claims 1 and 9.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, and Perng does not teach or suggest all the limitations of Applicant's independent claims 1 and 9, Applicant's independent claims 1 and 9 are not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, and Perng. Additionally, the claims that depend on independent claims 1 and 9, namely claim 7, and claim 14, respectively, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, and Perng for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claims 7 and 14 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, and Perng does teach or suggest all the limitations of Applicant's independent claims 1 and 9. Applicant's independent claims 1 and 9 are obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, and Perng. Additionally, the claims that depend on independent claims 1 and 9, namely claim 7, and claim 14, respectively, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, and Perng for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejection of claims 7 and 14 is respectfully maintained.

Argument 7
Claims 8 and 15
Claim 8 depends on independent claim 1. Claim 15 depends on independent claim 9. As asserted above, Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, and Perng fails to teach or suggest all the claimed limitations of independent claims 1 and 9.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, Perng, and Wang does not teach or suggest all the limitations of Applicant's independent claims 1 and 9, Applicant's independent claims 1 and 9 are not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, Perng, and Wang. Additionally, the claims that depend on independent claims 1 and 9, namely claim 8, and claim 15, respectively, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, Perng, and Wang for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claims 8 and 15 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, Perng, and Wang does teach or suggest all the limitations of Applicant's independent claims 1 and 9. Applicant's independent claims 1 and 9 are obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, Perng, and Wang. Additionally, the claims that depend on independent claims 1 and 9, namely claim 8, and claim 15, respectively, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Caraviello, Perng, and Wang for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejection of claims 8 and 15 is respectfully maintained.

Argument 8
Claim 18
Claim 18 depends on independent claim 16. As asserted above, Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, and Kripalani fails to teach or suggest all the claimed limitations of independent claim 16.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Tung, and Liu does not teach or suggest all the limitations of Applicant's independent claim 16, Applicant's independent claim 16 is not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Tung, and Liu. Additionally, the claims that depend on independent claim 16, namely claim 18, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Tung, and Liu for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claim 18 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Tung, and Liu does teach or suggest all the limitations of Applicant's independent claim 16. Applicant's independent claim 16 is obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Tung, and Liu. Additionally, the claims that depend on independent claim 16, namely claim 18, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Tung, and Liu for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejection of claim 18 is respectfully maintained.

Argument 9
Claim 19
Claim 19 depends on independent claim 16. As asserted above, Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng fails to teach or suggest all the claimed limitations of independent claim 16.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng does not teach or suggest all the limitations of Applicant's independent claim 16, Applicant's independent claim 16 is not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng. Additionally, the claims that depend on independent claim 16, namely claim 19, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claim 19 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng does teach or suggest all the limitations of Applicant's independent claim 16. Applicant's independent claim 16 is obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng. Additionally, the claims that depend on independent claim 16, namely claim 19, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejection of claim 19 is respectfully maintained.

Argument 10
Claim 20
Claim 20 depends on independent claim 16. As asserted above, Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, and Perng fails to teach or suggest all the claimed limitations of independent claim 16.
Since Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, Perng, and Wang does not teach or suggest all the limitations of Applicant's independent claim 16, Applicant's independent claim 16 is not obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, Perng, and Wang. Additionally, the claims that depend on independent claim 16, namely claim 20, are also patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, Perng, and Wang for at least the same reasons.
Accordingly, withdrawal of the 35 U.S.C. §103 rejection of claim 20 is respectfully requested.

Examiner response:
Examiner respectfully disagrees. Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, Perng, and Wang does teach or suggest all the limitations of Applicant's independent claim 16. Applicant's independent claim 16 is obvious over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, Perng, and Wang. Additionally, the claims that depend on independent claim 16, namely claim 20, are also not patentable over Stojanovic in view of Eaton, Conway, Datta, Furuichi, Lingenfelder, Hall, Kazama, Figueroa, Kripalani, Muffat, Tung, Liu, Caraviello, Perng, and Wang for at least the same reasons.
Accordingly, the 35 U.S.C. § 103 rejection of claim 20 is respectfully maintained.



Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABABACAR SECK whose telephone number is (571)270-7146.  The examiner can normally be reached on Monday-Friday 8:00 A.M.-6:00 P.M.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ABABACAR SECK/Examiner, Art Unit 2122                                                                                                                                                                                                        
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122