DETAILED ACTION
This office action is in response to Applicant’s arguments and amendments filed on April 22, 2022. The application contains claims 1-21: 
Claims 1-4, 8-11, and 15-18 are amended.
Claims 1-21 are pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments and amendments filed on April 22, 2022 have been fully considered and the objections and rejections are updated accordingly. 

Claim Objections
In view of the amendments to the claims, the objections to claims 2-4, 9-11, and 16-18 are withdrawn.
However, the amendments raise new issues. Please see below for details.

Claim Rejections - 35 USC § 103
	Applicant’s arguments with respect to the new limitations introduced with the amendments are addressed with new citations along with rationale. 
Please refer to the updated 35 U.S.C. 103 rejections as set forth below for details.

Claim Objections
Claim 15 is objected to because of the following informalities:
Claim 15, line 3: there is an extra “a” in “a operations”
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant regards as the invention.
Claims 1, 8, and 15 each recite “and retraining/retrain the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior” as a claim limitation at the end of each respective claim. There is insufficient antecedent basis for the underlined portion in this limitation in the claim. Paragraph [0053], which Applicant provides as support for the amendments, does not offer antecedent basis for the underlined portion. Therefore, claims 1, 8, and 15 are indefinite and rejected under 35 U.S.C. 112(b).
Dependent claims 2-7, 9-14, and 16-21 are also rejected for inheriting the deficiency from their corresponding independent claims 1, 8, and 15, respectively.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 8, 9, 15, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Roy et al. (US 10354201 B1), in view of Dherange et al. (US 20200382536 A1).

With regard to claim 1,
Roy teaches
a method of evaluating a first command line interface (CLI) input of a process executing on a computing system (Col. 2, lines 60-64: a method of clustering data. Fig. 1; Col. 6, lines 61-67; Col. 7, lines 1-8: command line tools correspond to “CLI”, source data indicated via command line tools corresponds to “CLI input”, the system shown in Fig. 1 corresponds to “a computing system” that executes the clustering process, i.e., “a process”), comprising: 
examining the first CLI input (Col. 7, lines 32-36: determining parameters or properties of the clustering methodology based on attributes of observation records indicates “examining” the observation records, i.e., the “CLI input”); 
selecting a first clustering model corresponding to the process based on examining the first CLI input (Col. 7, lines 32-36: clustering methodology corresponds to “a first clustering model”, and attributes of observation records indicates “based on examining the first CLI input” as discussed above), wherein the first clustering model is created based on a first clustering configuration and a first feature type combination identifying a first set of feature types of a plurality of feature types (Col. 7, lines 32-67; Col. 8, lines 1-9: select a clustering methodology based on the number of clusters and attributes of several different types may be taken into account in the algorithms, wherein the number of clusters corresponds to “a first clustering configuration” and different types of attributes in observation records, e.g., numerical, categorical or text attributes, etc., corresponds to “feature type combination”), wherein: 
the first clustering model corresponds to a data set created as a result of an application of a clustering algorithm with the first clustering configuration to a plurality of first feature combinations corresponding to a plurality of CLI inputs, each of the plurality of first feature combinations comprising a corresponding first set of features of the corresponding CLI input, each of the first set of features having a feature type of the first set of feature types (Col. 7, lines 32-67; Col. 8, lines 1-9: generalized K-means, generalized K-medians, generalized K-harmonic-means, etc. are examples of “a clustering algorithm”, the number of clusters corresponds to “the first clustering configuration” as discussed above, each attribute value as included in each observation record has a particular attribute type, e.g., numerical, categorical or text attributes, etc., i.e., “a feature type”); 
the first clustering configuration configures the clustering algorithm to cluster the data set into one or more clusters (Col. 7, lines 36-37: number of clusters); 
creating a first feature combination for the first CLI input based on the first feature type combination (Col. 7, lines 32-67; Col. 8, lines 1-9: distance metrics for different attributes of the observation records, weights to be assigned to the attributes, and normalization techniques to be applied to the different attributes all inherently teach extracting attributes of different types from observation records, i.e., “creating a first feature combination … based on the first feature type combination”); 
evaluating the first CLI input using the first clustering model and the first feature combination, wherein the evaluating further comprises determining a similarity score corresponding to a similarity between the first feature combination and the one or more clusters of the first clustering model (Fig. 7; Col. 13, lines 56-67; Col. 14, lines 1-42: compute the multi-attribute distances for each observation record with respect to each cluster representative, wherein the multi-attribute distances correspond to “a similarity score”); 
Roy does not teach
determining whether or not the first CLI input corresponds to normal behavior based on the similarity score; and 
retraining the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior.
Dherange teaches
determining whether or not the first CLI input corresponds to normal behavior based on the similarity score ([0148]-[0157]: detect anomalies based on similarity scores); and 
retraining the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior ([0040]; [0049]: retrain the ML model using feedback from a human operator marking false positives/negatives, wherein the marking of false positives corresponds to “normal behavior”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy to incorporate the teachings of Dherange to determine whether or not the first CLI input corresponds to normal behavior based on the similarity score and retrain the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior. Doing so would provide techniques that monitor a computer network or analyze data to detect anomalies in usage or the occurrence of a potentially fraudulent event, train the machine learning process on false positives/negatives based on human feedback, and apply that learning in future anomaly detection, as taught by Dherange ([0003]; [0040]).
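For illustration only, the combination relied upon above (scoring an input against existing clusters, classifying it as normal or anomalous from that score, and folding it back into the model only when it is normal) can be sketched as follows. This is a minimal sketch and not the implementation of Roy, Dherange, or the claimed invention: the function names, the Euclidean distance, the 1/(1+d) score mapping, and the 0.5 threshold are all assumptions chosen for brevity.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def similarity_score(features, clusters):
    """Return (index of nearest cluster, score in (0, 1]).

    The score mapping 1 / (1 + distance) is a hypothetical choice: a
    smaller distance to the nearest centroid gives a higher similarity.
    """
    dists = [math.dist(features, centroid(c)) for c in clusters]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, 1.0 / (1.0 + dists[idx])


def evaluate(features, clusters, threshold=0.5):
    """Classify an input and update the model only when it looks normal."""
    idx, score = similarity_score(features, clusters)
    is_normal = score >= threshold  # hypothetical decision rule
    if is_normal:
        # "Retraining" is reduced here to adding the input to its nearest
        # cluster, which shifts that cluster's centroid on the next call.
        clusters[idx].append(tuple(features))
    return is_normal, score
```

In this sketch an input classified as anomalous leaves the clusters untouched, which is the effect of the “only when” condition in the limitation.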

With regard to claim 2,
	As discussed in claim 1, Roy and Dherange teach all the limitations therein.
Roy further teaches
the method of claim 1, further comprising: 
creating the first clustering model for the process prior to the examining, wherein the creating further comprises: 
extracting, for each of the plurality of CLI inputs, a plurality of features; 
generating, for each of the plurality of CLI inputs, based on the corresponding extracted plurality of features, one or more feature combinations corresponding to one or more feature type combinations; 
applying the clustering algorithm to each of the one or more feature combinations of each of the plurality of CLI inputs with one or more clustering configurations to create a plurality of clustering models including the first clustering model, wherein for each application of the clustering algorithm: 
the clustering algorithm is applied with one of the one or more clustering configurations; and 
the clustering algorithm is applied to one of the one or more feature combinations for each of the plurality of CLI inputs; and 
selecting the first clustering model from the plurality of clustering models (Col. 5, lines 4-67: the current version of the clustering model indicates that the clustering model is created prior to the examining. The remaining limitations are similar to those discussed for the parent claim).
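For illustration only, the model-creation limitations of claim 2 (extract features, form feature combinations per feature type combination, and apply the clustering algorithm once per pairing of configuration and feature combination) can be sketched as follows. This is a sketch under stated assumptions, not the method of Roy or of the application: the three feature types, the cluster counts, and the small k-means routine with deterministic seeding are all hypothetical stand-ins.

```python
import math
from itertools import combinations, product


def extract_features(cli):
    """Hypothetical feature types: token count, flag count, character length."""
    tokens = cli.split()
    return {
        "tokens": float(len(tokens)),
        "flags": float(sum(t.startswith("-") for t in tokens)),
        "length": float(len(cli)),
    }


def kmeans(points, k, iterations=10):
    """Tiny k-means seeded with the first k points; illustrative only."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def build_models(cli_inputs, cluster_counts=(1, 2)):
    """One model per (feature type combination, clustering configuration)."""
    extracted = [extract_features(c) for c in cli_inputs]
    names = sorted(extracted[0])
    type_combos = [c for r in (2, 3) for c in combinations(names, r)]
    models = {}
    for combo, k in product(type_combos, cluster_counts):
        points = [tuple(f[n] for n in combo) for f in extracted]
        models[(combo, k)] = kmeans(points, k)
    return models
```

Each key of the returned dictionary pairs one feature type combination with one cluster count, so a later selection step can choose a single model from the plurality.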

With regard to claim 8,
Roy teaches
an apparatus (Fig. 1), comprising: 
a non-transitory memory comprising executable instructions (Fig. 12, System memory 9020); and 
a processor in data communication with the non-transitory memory (Fig. 12, Processors 9010a-9010n) and configured to execute the instructions to cause the apparatus to: 
examine a first command line interface (CLI) input of a process (Col. 7, lines 32-36: determining parameters or properties of the clustering methodology based on attributes of observation records indicates “examining” the observation records, i.e., the “CLI input”); 
select a first clustering model corresponding to the process based on examining the first CLI input (Col. 7, lines 32-36: clustering methodology corresponds to “a first clustering model”, and attributes of observation records indicates “based on examining the first CLI input” as discussed above), wherein the first clustering model is created based on a first clustering configuration and a first feature type combination identifying a first set of feature types of a plurality of feature types (Col. 7, lines 32-67; Col. 8, lines 1-9: select a clustering methodology based on the number of clusters and attributes of several different types may be taken into account in the algorithms, wherein the number of clusters corresponds to “a first clustering configuration” and different types of attributes in observation records, e.g., numerical, categorical or text attributes, etc., corresponds to “feature type combination”), wherein: 
the first clustering model corresponds to a data set created as a result of an application of a clustering algorithm with the first clustering configuration to a plurality of first feature combinations corresponding to a plurality of CLI inputs, each of the plurality of first feature combinations comprising a corresponding first set of features of the corresponding CLI input, each of the first set of features having a feature type of the first set of feature types (Col. 7, lines 32-67; Col. 8, lines 1-9: generalized K-means, generalized K-medians, generalized K-harmonic-means, etc. are examples of “a clustering algorithm”, the number of clusters corresponds to “the first clustering configuration” as discussed above, each attribute value as included in each observation record has a particular attribute type, e.g., numerical, categorical or text attributes, etc., i.e., “a feature type”); 
the first clustering configuration configures the clustering algorithm to cluster the data set into one or more clusters (Col. 7, lines 36-37: number of clusters); 
create a first feature combination for the first CLI input based on the first feature type combination (Col. 7, lines 32-67; Col. 8, lines 1-9: distance metrics for different attributes of the observation records, weights to be assigned to the attributes, and normalization techniques to be applied to the different attributes all inherently teach extracting attributes of different types from observation records, i.e., “creating a first feature combination … based on the first feature type combination”); 
evaluate the first CLI input using the first clustering model and the first feature combination, wherein the processor being configured to cause the apparatus to evaluate the first CLI input further comprises the processor being configured to cause the apparatus to determine a similarity score corresponding to a similarity between the first feature combination and the one or more clusters of the first clustering model (Fig. 7; Col. 13, lines 56-67; Col. 14, lines 1-42: compute the multi-attribute distances for each observation record with respect to each cluster representative, wherein the multi-attribute distances correspond to “a similarity score”); 
Roy does not teach
determine whether or not the first CLI input corresponds to normal behavior based on the similarity score; and 
retrain the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior.
Dherange teaches
determine whether or not the first CLI input corresponds to normal behavior based on the similarity score ([0148]-[0157]: detect anomalies based on similarity scores); and
retrain the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior ([0040]; [0049]: retrain the ML model using feedback from a human operator marking false positives/negatives, wherein the marking of false positives corresponds to “normal behavior”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy to incorporate the teachings of Dherange to determine whether or not the first CLI input corresponds to normal behavior based on the similarity score and retrain the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior. Doing so would provide techniques that monitor a computer network or analyze data to detect anomalies in usage or the occurrence of a potentially fraudulent event, train the machine learning process on false positives/negatives based on human feedback, and apply that learning in future anomaly detection, as taught by Dherange ([0003]; [0040]).

With regard to claim 9,
	As discussed in claim 8, Roy and Dherange teach all the limitations therein.
Roy further teaches
the apparatus of claim 8, wherein the processor is further configured to cause the apparatus to: 
create the first clustering model for the process prior to the examining, wherein the processor being configured to cause the apparatus to create further comprises the processor being configured to cause the apparatus to: 
extract, for each of the plurality of CLI inputs, a plurality of features; 
generate, for each of the plurality of CLI inputs, based on the corresponding extracted plurality of features, one or more feature combinations corresponding to one or more feature type combinations; 
apply the clustering algorithm to each of the one or more feature combinations of each of the plurality of CLI inputs with one or more clustering configurations to create a plurality of clustering models including the first clustering model, wherein for each application of the clustering algorithm: 
the clustering algorithm is applied with one of the one or more clustering configurations; and 
the clustering algorithm is applied to one of the one or more feature combinations for each of the plurality of CLI inputs; and 
select the first clustering model from the plurality of clustering models (Col. 5, lines 4-67: the current version of the clustering model indicates that the clustering model is created prior to the examining. The remaining limitations are similar to those discussed for the parent claim).

With regard to claim 15,
Roy teaches
a non-transitory computer readable medium having instructions stored thereon that, when executed by a computing system (the system shown in Fig. 1 corresponds to “a computing system”), cause the computing system to perform a operations comprising: 
examining a first command line interface (CLI) input of a process (Col. 7, lines 32-36: determining parameters or properties of the clustering methodology based on attributes of observation records indicates “examining” the observation records, i.e., the “CLI input”); 
selecting a first clustering model corresponding to the process based on examining the first CLI input (Col. 7, lines 32-36: clustering methodology corresponds to “a first clustering model”, and attributes of observation records indicates “based on examining the first CLI input” as discussed above), wherein the first clustering model is created based on a first clustering configuration and a first feature type combination identifying a first set of feature types of a plurality of feature types (Col. 7, lines 32-67; Col. 8, lines 1-9: select a clustering methodology based on the number of clusters and attributes of several different types may be taken into account in the algorithms, wherein the number of clusters corresponds to “a first clustering configuration” and different types of attributes in observation records, e.g., numerical, categorical or text attributes, etc., corresponds to “feature type combination”), wherein: 
the first clustering model corresponds to a data set created as a result of an application of a clustering algorithm with the first clustering configuration to a plurality of first feature combinations corresponding to a plurality of CLI inputs, each of the plurality of first feature combinations comprising a corresponding first set of features of the corresponding CLI input, each of the first set of features having a feature type of the first set of feature types (Col. 7, lines 32-67; Col. 8, lines 1-9: generalized K-means, generalized K-medians, generalized K-harmonic-means, etc. are examples of “a clustering algorithm”, the number of clusters corresponds to “the first clustering configuration” as discussed above, each attribute value as included in each observation record has a particular attribute type, e.g., numerical, categorical or text attributes, etc., i.e., “a feature type”); 
the first clustering configuration configures the clustering algorithm to cluster the data set into one or more clusters (Col. 7, lines 36-37: number of clusters); 
creating a first feature combination for the first CLI input based on the first feature type combination (Col. 7, lines 32-67; Col. 8, lines 1-9: distance metrics for different attributes of the observation records, weights to be assigned to the attributes, and normalization techniques to be applied to the different attributes all inherently teach extracting attributes of different types from observation records, i.e., “creating a first feature combination … based on the first feature type combination”); 
evaluating the first CLI input using the first clustering model and the first feature combination, wherein the evaluating further comprises determining a similarity score corresponding to a similarity between the first feature combination and the one or more clusters of the first clustering model (Fig. 7; Col. 13, lines 56-67; Col. 14, lines 1-42: compute the multi-attribute distances for each observation record with respect to each cluster representative, wherein the multi-attribute distances correspond to “a similarity score”); 
Roy does not teach
determining whether or not the first CLI input corresponds to normal behavior based on the similarity score; and
retraining the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior.
Dherange teaches
determining whether or not the first CLI input corresponds to normal behavior based on the similarity score ([0148]-[0157]: detect anomalies based on similarity scores); and
retraining the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior ([0040]; [0049]: retrain the ML model using feedback from a human operator marking false positives/negatives, wherein the marking of false positives corresponds to “normal behavior”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy to incorporate the teachings of Dherange to determine whether or not the first CLI input corresponds to normal behavior based on the similarity score and retrain the first clustering model using the first CLI input only when the first CLI input corresponds to the normal behavior. Doing so would provide techniques that monitor a computer network or analyze data to detect anomalies in usage or the occurrence of a potentially fraudulent event, train the machine learning process on false positives/negatives based on human feedback, and apply that learning in future anomaly detection, as taught by Dherange ([0003]; [0040]).

With regard to claim 16,
	As discussed in claim 15, Roy and Dherange teach all the limitations therein.
Roy further teaches
the non-transitory computer readable medium of claim 15, wherein the operations further comprise: 
creating the first clustering model for the process prior to the examining, wherein the creating further comprises: 
extracting, for each of the plurality of CLI inputs, a plurality of features; 
generating, for each of the plurality of CLI inputs, based on the corresponding extracted plurality of features, one or more feature combinations corresponding to one or more feature type combinations; 
applying the clustering algorithm to each of the one or more feature combinations of each of the plurality of CLI inputs with one or more clustering configurations to create a plurality of clustering models including the first clustering model, wherein for each application of the clustering algorithm: 
the clustering algorithm is applied with one of the one or more clustering configurations; and 
the clustering algorithm is applied to one of the one or more feature combinations for each of the plurality of CLI inputs; and 
selecting the first clustering model from the plurality of clustering models (Col. 5, lines 4-67: the current version of the clustering model indicates that the clustering model is created prior to the examining. The remaining limitations are similar to those discussed for the parent claim).

Claims 3-5, 10-12, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Roy et al. (US 10354201 B1), in view of Dherange et al. (US 20200382536 A1), and in further view of LOPEZ DE PRADO (US 20190294990 A1).

With regard to claim 3,
As discussed in claim 2, Roy and Dherange teach all the limitations therein.
Roy further teaches
the method of claim 2, wherein selecting the first clustering model comprises: 
identifying one or more clustering models from the plurality of clustering models having average scores above a threshold (Col. 5, lines 4-36: determining that the fraction or number of assignment changes made during an iteration falls below a threshold, or that a cost function evaluated for the clustering model reaches a threshold, is equivalent to “average scores above a threshold”); 
Roy and Dherange do not teach
the method of claim 2, wherein selecting the first clustering model comprises: 
performing analysis for each of the plurality of clustering models to generate an average score and a standard deviation for each of the plurality of clustering models; 
generating a ratio for each of the identified one or more clustering models, the ratio corresponding to a ratio of a average score of a corresponding clustering model to a standard deviation of the corresponding clustering model; and 
selecting the first clustering model from the identified one or more clustering models based on the ratios of the identified one or more clustering models.
LOPEZ DE PRADO teaches
the method of claim 2, wherein selecting the first clustering model comprises: 
performing analysis for each of the plurality of clustering models to generate an average score and a standard deviation for each of the plurality of clustering models; 
generating a ratio for each of the identified one or more clustering models, the ratio corresponding to a ratio of a average score of a corresponding clustering model to a standard deviation of the corresponding clustering model; and 
selecting the first clustering model from the identified one or more clustering models based on the ratios of the identified one or more clustering models ([0139]-[0143]: compute the quality score q as the average silhouette score across the k clusters of each clustering scheme divided by the standard deviation of the predicted outcomes, and select the clustering scheme having the highest quality score q).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy and Dherange to incorporate the teachings of LOPEZ DE PRADO to generate a ratio for each clustering model by dividing an average score of the clustering model by a standard deviation of the corresponding clustering model and select the clustering model having the highest ratio. Doing so would select a clustering scheme that achieves a highest quality score among a plurality of quality scores computed for the plurality of clustering schemes as taught by LOPEZ DE PRADO ([0017]).
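For illustration only, the selection described in this rejection (keep the models whose average score exceeds a threshold, form the ratio of average score to standard deviation for each, and pick the highest ratio) can be sketched as follows. This is a minimal sketch, not the procedure of LOPEZ DE PRADO or of the application: the function name and the threshold are hypothetical, and taking the standard deviation over the same per-sample scores used for the average is an assumption made to keep the example self-contained.

```python
from statistics import mean, pstdev


def select_model(scores_by_model, min_average=0.0):
    """Return (best model name, ratios) using ratio = average / std deviation.

    scores_by_model maps a model name to its per-sample quality scores
    (for example, silhouette values). Models with an average at or below
    min_average, or with zero spread, are skipped in this sketch.
    """
    ratios = {}
    for name, scores in scores_by_model.items():
        avg = mean(scores)
        sd = pstdev(scores)
        if avg > min_average and sd > 0:
            ratios[name] = avg / sd
    if not ratios:
        return None, ratios
    best = max(ratios, key=ratios.get)
    return best, ratios
```

With two models sharing the same average score, the one with the smaller spread receives the larger ratio and is selected, which is the behavior the ratio is meant to reward.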

With regard to claim 4,
As discussed in claim 3, Roy, Dherange, and LOPEZ DE PRADO teach all the limitations therein.
LOPEZ DE PRADO further teaches
the method of claim 3, wherein the first clustering model has a highest ratio among all the identified one or more clustering models ([0143]: select the clustering scheme having the highest quality score q).

With regard to claim 5,
As discussed in claim 3, Roy, Dherange, and LOPEZ DE PRADO teach all the limitations therein.
LOPEZ DE PRADO further teaches
the method of claim 3, wherein the analysis comprises a silhouette analysis ([0142]).

With regard to claim 10,
As discussed in claim 9, Roy and Dherange teach all the limitations therein.
Roy further teaches
the apparatus of claim 9, wherein the processor being configured to cause the apparatus to select the first clustering model further comprises the processor being configured to cause the apparatus to: 
identify one or more clustering models from the plurality of clustering models having average scores above a threshold (Col. 5, lines 4-36: determining that the fraction or number of assignment changes made during an iteration falls below a threshold, or that a cost function evaluated for the clustering model reaches a threshold, is equivalent to “average scores above a threshold”); 
Roy and Dherange do not teach
the apparatus of claim 9, wherein the processor being configured to cause the apparatus to select the first clustering model further comprises the processor being configured to cause the apparatus to: 
perform analysis for each of the plurality of clustering models to generate an average score and a standard deviation for each of the plurality of clustering models; 
generate a ratio for each of the identified one or more clustering models, the ratio corresponding to a ratio of an average score of a corresponding clustering model to a standard deviation of the corresponding clustering model; and 
select the first clustering model from the identified one or more clustering models based on the ratios of the identified one or more clustering models.
LOPEZ DE PRADO teaches
the apparatus of claim 9, wherein the processor being configured to cause the apparatus to select the first clustering model further comprises the processor being configured to cause the apparatus to: 
perform analysis for each of the plurality of clustering models to generate an average score and a standard deviation for each of the plurality of clustering models; 
generate a ratio for each of the identified one or more clustering models, the ratio corresponding to a ratio of an average score of a corresponding clustering model to a standard deviation of the corresponding clustering model; and 
select the first clustering model from the identified one or more clustering models based on the ratios of the identified one or more clustering models ([0139]-[0143]: compute the quality score q as the average silhouette score across the k clusters of each clustering scheme divided by the standard deviation of the predicted outcomes, and select the clustering scheme having the highest quality score q).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy and Dherange to incorporate the teachings of LOPEZ DE PRADO to generate a ratio for each clustering model by dividing an average score of the clustering model by a standard deviation of the corresponding clustering model and select the clustering model having the highest ratio. Doing so would select a clustering scheme that achieves a highest quality score among a plurality of quality scores computed for the plurality of clustering schemes as taught by LOPEZ DE PRADO ([0017]).

With regard to claim 11,
As discussed in claim 10, Roy, Dherange, and LOPEZ DE PRADO teach all the limitations therein.
LOPEZ DE PRADO further teaches
the apparatus of claim 10, wherein the first clustering model has a highest ratio among all the identified one or more clustering models ([0143]: select the clustering scheme having the highest quality score q).

With regard to claim 12,
As discussed in claim 10, Roy, Dherange, and LOPEZ DE PRADO teach all the limitations therein.
LOPEZ DE PRADO further teaches
the apparatus of claim 10, wherein the analysis comprises a silhouette analysis ([0142]).

With regard to claim 17,
As discussed in claim 16, Roy and Dherange teach all the limitations therein.
Roy further teaches
the non-transitory computer readable medium of claim 16, wherein selecting the first clustering model comprises: 
identifying one or more clustering models from the plurality of clustering models having average scores above a threshold (Col. 5, lines 4-36: determining that the fraction or number of assignment changes made during an iteration falls below a threshold, or that a cost function evaluated for the clustering model reaches a threshold, is equivalent to “average scores above a threshold”); 
Roy and Dherange do not teach
the non-transitory computer readable medium of claim 16, wherein selecting the first clustering model comprises: 
performing analysis for each of the plurality of clustering models to generate an average score and a standard deviation for each of the plurality of clustering models; 
generating a ratio for each of the identified one or more clustering models, the ratio corresponding to a ratio of a average score of a corresponding clustering model to a standard deviation of the corresponding clustering model; and 
selecting the first clustering model from the identified one or more clustering models based on the ratios of the identified one or more clustering models.
LOPEZ DE PRADO teaches
the non-transitory computer readable medium of claim 16, wherein selecting the first clustering model comprises: 
performing analysis for each of the plurality of clustering models to generate an average score and a standard deviation for each of the plurality of clustering models; 
generating a ratio for each of the identified one or more clustering models, the ratio corresponding to a ratio of a average score of a corresponding clustering model to a standard deviation of the corresponding clustering model; and 
selecting the first clustering model from the identified one or more clustering models based on the ratios of the identified one or more clustering models ([0139]-[0143]: compute the quality score q as the average silhouette score across the k clusters of each clustering scheme divided by the standard deviation of the predicted outcomes, and select the clustering scheme having the highest quality score q).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy and Dherange to incorporate the teachings of LOPEZ DE PRADO to generate a ratio for each clustering model by dividing an average score of the clustering model by a standard deviation of the corresponding clustering model and select the clustering model having the highest ratio. Doing so would select a clustering scheme that achieves a highest quality score among a plurality of quality scores computed for the plurality of clustering schemes as taught by LOPEZ DE PRADO ([0017]).

With regard to claim 18,
As discussed in claim 17, Roy, Dherange, and LOPEZ DE PRADO teach all the limitations therein.
LOPEZ DE PRADO further teaches
the non-transitory computer readable medium of claim 17, wherein the first clustering model has a highest ratio among all the identified one or more clustering models ([0143]: select the clustering scheme having the highest quality score q).

With regard to claim 19,
As discussed in claim 17, Roy, Dherange, and LOPEZ DE PRADO teach all the limitations therein.
LOPEZ DE PRADO further teaches
the non-transitory computer readable medium of claim 17, wherein the analysis comprises a silhouette analysis ([0142]).

Claims 6, 7, 13, 14, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Roy et al. (US 10354201 B1), in view of Dherange et al. (US 20200382536 A1), and in further view of Levy et al. (US 20190370347 A1).

With regard to claim 6,
As discussed in claim 2, Roy and Dherange teach all the limitations therein.
Roy and Dherange do not teach
the method of claim 2, wherein extracting the plurality of features from each of the plurality of CLI inputs comprises extracting a categorical feature corresponding to a pattern of the corresponding CLI input.
Levy teaches
the method of claim 2, wherein extracting the plurality of features from each of the plurality of CLI inputs comprises extracting a categorical feature corresponding to a pattern of the corresponding CLI input ([0148]-[0157]: representative string pattern, which represents a common string pattern shared by all training log messages associated with the one or more clusters, corresponds to “a categorical feature”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy and Dherange to incorporate the teachings of Levy to extract a categorical feature corresponding to a pattern of the corresponding CLI input. Doing so would facilitate calculating a string distance between a textual content of the respective CLI input and a representative string pattern of each of the plurality of clusters as taught by Levy ([0009]).

With regard to claim 7,
As discussed in claim 6, Roy, Dherange, and Levy teach all the limitations therein.
Levy further teaches
the method of claim 6, wherein: 
the pattern of the corresponding CLI input corresponds to a number and type of parameters in the corresponding CLI input; and the categorical feature represents the pattern of the corresponding CLI input ([0148]-[0157]: number and type of parameters, e.g., constant tokens, variable fields. The representative string pattern corresponds to a common string pattern shared by all training log messages associated with the one or more clusters).
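For illustration only, a categorical feature that encodes the number and type of parameters in a CLI input might be derived as sketched below. This is a hypothetical sketch and is not taken from Levy or from the application. The function names `classify_token` and `cli_pattern`, the type labels (FLAG, NUM, PATH, STR), and the classification rules are all assumptions chosen for the example.

```python
import shlex


def classify_token(token):
    """Assign a coarse, hypothetical type label to one CLI parameter."""
    if token.startswith("-"):
        return "FLAG"  # options such as -l or --all (also catches "-5")
    try:
        float(token)
        return "NUM"
    except ValueError:
        pass
    if "/" in token:
        return "PATH"
    return "STR"


def cli_pattern(cli_input):
    """Return a category string such as '2:FLAG,PATH' for one CLI input.

    The first token is treated as the command name; the remaining tokens
    are the parameters whose number and types make up the pattern.
    """
    tokens = shlex.split(cli_input)
    if not tokens:
        return "EMPTY"
    param_types = [classify_token(t) for t in tokens[1:]]
    return f"{len(param_types)}:" + ",".join(param_types)


print(cli_pattern("ls -l /tmp"))
print(cli_pattern("head -n 10 notes.txt"))
print(cli_pattern("tail -n 5 log.txt"))
```

Under these assumptions, two inputs with different commands and values but the same parameter count and types (the `head` and `tail` lines) map to the same category, which is the sense in which the feature represents a pattern rather than the literal text.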

With regard to claim 13,
As discussed in claim 9, Roy and Dherange teach all the limitations therein.
Roy and Dherange do not teach
the apparatus of claim 9, wherein extracting the plurality of features from each of the plurality of CLI inputs comprises extracting a categorical feature corresponding to a pattern of the corresponding CLI input.
Levy teaches
the apparatus of claim 9, wherein extracting the plurality of features from each of the plurality of CLI inputs comprises extracting a categorical feature corresponding to a pattern of the corresponding CLI input ([0148]-[0157]: representative string pattern, which represents a common string pattern shared by all training log messages associated with the one or more clusters, corresponds to “a categorical feature”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy and Dherange to incorporate the teachings of Levy to extract a categorical feature corresponding to a pattern of the corresponding CLI input. Doing so would facilitate calculating a string distance between a textual content of the respective CLI input and a representative string pattern of each of the plurality of clusters as taught by Levy ([0009]).

With regard to claim 14,
As discussed in claim 13, Roy, Dherange, and Levy teach all the limitations therein.
Levy further teaches
the apparatus of claim 13, wherein: 
the pattern of the corresponding CLI input corresponds to a number and type of parameters in the corresponding CLI input; and the categorical feature represents the pattern of the corresponding CLI input ([0148]-[0157]: number and type of parameters, e.g., constant tokens, variable fields. The representative string pattern corresponds to a common string pattern shared by all training log messages associated with the one or more clusters).

With regard to claim 20,
As discussed in claim 16, Roy and Dherange teach all the limitations therein.
Roy and Dherange do not teach
the non-transitory computer readable medium of claim 16, wherein extracting the plurality of features from each of the plurality of CLI inputs comprises extracting a categorical feature corresponding to a pattern of the corresponding CLI input.
Levy teaches
the non-transitory computer readable medium of claim 16, wherein extracting the plurality of features from each of the plurality of CLI inputs comprises extracting a categorical feature corresponding to a pattern of the corresponding CLI input ([0148]-[0157]: representative string pattern, which represents a common string pattern shared by all training log messages associated with the one or more clusters, corresponds to “a categorical feature”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Roy and Dherange to incorporate the teachings of Levy to extract a categorical feature corresponding to a pattern of the corresponding CLI input. Doing so would facilitate calculating a string distance between a textual content of the respective CLI input and a representative string pattern of each of the plurality of clusters as taught by Levy ([0009]).

With regard to claim 21,
As discussed in claim 20, Roy, Dherange, and Levy teach all the limitations therein.
Levy further teaches
the non-transitory computer readable medium of claim 20, wherein: 
the pattern of the corresponding CLI input corresponds to a number and type of parameters in the corresponding CLI input; and the categorical feature represents the pattern of the corresponding CLI input ([0148]-[0157]: number and type of parameters, e.g., constant tokens, variable fields. The representative string pattern corresponds to a common string pattern shared by all training log messages associated with the one or more clusters).

Examiner’s Note
Examiner has pointed out particular references contained in the prior art of record in the body of this action for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and Figures may apply as well. Applicant is respectfully requested, in preparing the response, to consider fully the entire references as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner. It is noted that any citation to specific pages, columns, figures, or lines in the prior art references, and any interpretation of the references, should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIAOQIN HU, whose telephone number is (571) 272-1792. The examiner can normally be reached Monday-Friday, 7:00am-3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/XIAOQIN HU/Examiner, Art Unit 2168            

/IRETE F EHICHIOYA/Supervisory Patent Examiner, Art Unit 2168