DETAILED ACTION
	This action is in response to the remarks submitted on 7/28/2021. Claims 1, 2, 4, and 6 have been amended. Claims 1-7 are pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s amendments necessitated new grounds of rejection. 

Response to Arguments
The Applicant’s arguments regarding the rejection of above-mentioned claims have been fully considered.
In reference to Applicant’s arguments about:
Claims Objections.
Examiner’s response:
           	Objections withdrawn in response to amendments.
In reference to Applicant’s arguments about:
Specification Objections.
Examiner’s response:
	Objections withdrawn in response to amendments.
In reference to Applicant’s arguments about:
Rejection under 35 USC §112 (b).
Examiner’s response:
Rejections withdrawn in response to amendments.
In reference to Applicant’s arguments about:
Double Patenting.
Examiner’s response:
            Rejection upheld.
In reference to Applicant’s arguments about:
Rejections under 35 USC §101.
Examiner’s response:
The examiner respectfully disagrees. 
In regards to argument:
“the present claims, which relate to a complex process including generating feature vectors and classification rules, is far more than a simple mental process. Applicant therefore respectfully submits that the claims are not directed to an abstract idea”,
The Examiner holds the view that generating rules and vectors are routine mental operations.
“the amended claims…are eligible because they "reflect[] an improvement in the functioning of a computer, or an improvement to other technology or technical field," and further "integrate" the alleged "judicial exception into a practical application of the exception.”,
The Examiner holds the view that the use of a machine learning algorithm, as claimed, is generic, and not sufficient in and of itself to represent a clear technological improvement over existing technology, nor to integrate the claimed invention into a practical application.
“The claim elements, taken as an ordered combination, are inventive steps that amount to significantly more than a patent upon an abstract idea.”
The Examiner holds the view that the claim elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
For these above reasons, rejections are still maintained.

In reference to Applicant’s arguments about:
Rejections under 35 USC §103.
Examiner’s response:
            Applicants’ arguments have been considered, but they are directed to the newly added limitations to independent claims. These amendments necessitated new grounds of rejection; therefore, arguments are moot in view of the new grounds of rejection.




Double Patenting

 A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101 which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (Emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).
A statutory type (35 U.S.C. 101) double patenting rejection can be overcome by canceling or amending the claims that are directed to the same invention so they are no longer coextensive in scope. The filing of a terminal disclaimer cannot overcome a double patenting rejection based upon 35 U.S.C. 101.
Claims 1-7 are provisionally rejected under 35 U.S.C. 101 as claiming the same invention as that of claims 8-14 of copending Application No. 15/820,117. This is a provisional statutory double patenting rejection since the claims directed to the same invention have not in fact been patented.


INSTANT APPLICATION
[A method comprising:] receiving a plurality of assets from a data catalog and a [respective] plurality of classifications applied to the plurality of assets in the data catalog;
Application 15/820,117
receiving a plurality of assets from a data catalog and a plurality of classifications applied to the plurality of assets in the data catalog; 

extracting, for a plurality of features, feature data from the plurality of assets and the plurality of asset classifications;
generating a feature vector based on the extracted feature data; 
generating a feature vector based on the extracted feature data; 
generating, by a machine learning (ML) algorithm and based on the feature vector, a first classification rule specifying a condition for applying a first classification of the plurality of classifications to a first asset of the plurality of assets;
generating, by a machine learning (ML) algorithm and based on the feature vector, a first classification rule specifying a condition for applying a first classification of the plurality of classifications to a first asset of the plurality of assets;
identifying a second classification rule;
identifying a second classification rule;
determining a number of terms that match and are present in both the first classification rule and the second classification rule;
determining a number of terms that match and are present in both the first classification rule and the second classification rule;
upon determining that the number of terms that match exceeds a threshold, determining whether differences between the first and second classification rules are significant based on determining whether the differences only include use of different data types; and
upon determining that the number of terms that match exceeds a threshold, determining whether differences between the first and second classification rules are significant based on determining whether the differences only include use of different data types; and
upon determining that the differences between the first and second classification rules are not 

Claim 2
Claim 9
determining that a second classification of the plurality of classifications applied to the first asset was applied to the first asset by a user;
determining that a second classification of the plurality of classifications applied to the first asset was applied to the first asset by a user;
identifying a third classification rule generated by the ML algorithm; 

	
identifying a third classification rule generated by the ML algorithm;
determining that a number of terms that match and are present in both the first classification rule and the third classification rule exceeds a threshold;
determining that a number of terms that match and are present in both the first classification rule and the third classification rule exceeds a threshold;
outputting the first and third classification rules to the user with an indication suggesting to replace the third classification rule with the first classification rule.
outputting the first and third classification rules to the user with an indication suggesting to replace the third classification rule with the first classification rule.	
Claim 3
Claim 10
storing the first classification rule
storing the first classification rule
determining a new asset has been added to the data catalog
determining a new asset has been added to the data catalog

determining that the new asset satisfies the condition specified in the first classification rule
programmatically applying the first classification to the new asset
programmatically applying the first classification to the new asset
Claim 4
Claim 11
determining that a second classification of the plurality of classifications was programmatically applied to the first asset based on a third classification rule generated by the ML algorithm
determining that a second classification of the plurality of classifications was programmatically applied to the first asset based on a third classification rule generated by the ML algorithm
determining that a number of terms that match and are present in the first classification rule and the third classification rule exceeds a threshold
determining that a number of matching terms present in both the first classification rule and the third classification rule exceeds a threshold
outputting the first and third classification rules to a user with an indication suggesting to replace the third classification rule with the first classification rule.
outputting the first and third classification rules to a user with an indication specifying suggesting to replace the third classification rule with the first classification rule.
Claim 5
Claim 12
wherein the ML algorithm comprises one of: (i) a decision tree based classifier, (ii) a support vector machine, and (iii) an artificial neural 

Claim 6
Claim 13
wherein the plurality of features comprise: (i) the plurality of classifications, (ii) a type of each of the plurality of classifications, (iii) a data format of each of the plurality of assets, (iv) a relationship between two or more of the plurality of classifications, (v) a project to which each of the plurality of assets belong, (vi) a data quality score computed for each of the plurality of assets, (vii) a set of tags applied to each of the plurality of assets, (viii) a name of each of the plurality of assets, (ix), a textual description of each of the plurality of assets, and (x) a group of assets comprising a subset of the plurality of assets.
wherein the plurality of features comprise: (i) the plurality of classifications, (ii) a type of each of the plurality of classifications, (iii) a data format of each of the plurality of assets, (iv) a relationship between two or more of the plurality of classifications, (v) a project to which each of the plurality of assets belong, (vi) a data quality score computed for each of the plurality of assets, (vii) a set of tags applied to each of the plurality of assets, (viii) a name of each of the plurality of assets, (ix), a textual description of each of the plurality of assets, and (x) a group of assets comprising a subset of the plurality of assets.
Claim 7
Claim 14
wherein the plurality of assets comprise: (i) a database, (ii) files, (iii) columns in the database, and (iv) a table in the database.
wherein the plurality of assets comprise: (i) a database, (ii) files, (iii) columns in the database, and (iv) a table in the database.


	
Claim Rejections - 35 USC § 101

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-7 are rejected under 35 U.S.C. §101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: In the instant case, the claims are directed to a method. Thus, each of the claims falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).

Step 2A Analysis: Based on the claims being determined to be within the four statutory categories, it must be determined if the claims are directed to a judicial exception (i.e., law of nature, natural phenomenon, or abstract idea). In this case, the claims fall within the judicial exception of abstract idea. Specifically, the abstract idea of “Mental processes (including an observation, evaluation, judgement, or opinion)”.

Step 2A Prong 1: Claim 1 recites:

“extracting, for a plurality of feature…” (evaluation/judgement)
“generating a feature vector…” (evaluation)
“generating... based on the feature vector…” (evaluation)
“identifying a second…” (judgement)
“determining a number…” (observation)
“upon determining that the number of terms… exceeds a threshold…determining whether differences…are significant…” (judgement)
“upon determining… refraining…” (judgement)
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the additional element in claim 1, “by a machine learning algorithm”, corresponds to mere instructions to implement an abstract idea or other exception on a computer. “Receiving a plurality of assets from a data catalog and a respective plurality of classifications applied to each asset in the data catalog” is merely an insignificant extra-solution activity to the judicial exception. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The limitation “receiving a plurality of assets from a data catalog and a respective plurality of classifications applied to each asset in the data catalog” is known by the courts to be well understood, routine, and conventional. The limitation is directed to the well understood, routine, and conventional computer function storing and retrieving information in memory (MPEP 2106.05(d)(II)).

Step 2A Prong 1: Claim 2 recites:
“determining that a second classification...” (observation)
“identifying a third...” (judgement)
“determining that a number...” (judgement)
Step 2A: Prong 2 analysis: Claim 2 includes the limitation “outputting the first and second classification rule to the user with an indication specifying to replace the second classification with the first classification rule”, which is merely an insignificant extra-solution activity to the judicial exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The limitation “outputting the first and second classification rule to the user with an indication specifying to replace the second classification with the first classification rule” is known by the courts to be well understood, routine, and conventional. The limitation is directed to the well understood, routine, and conventional computer function receiving or transmitting data over a network, more specifically, sending messages over a network. (MPEP 2106.05(d)(II)(i)).

Step 2A, Prong 1 analysis: Claim 3 recites:
“determining a new asset...” (observation)
“determining that the new asset...” (judgement)
“programmatically applying...” (evaluation)
Step 2A: Prong 2 analysis: Claim 3 includes the limitation “storing the first classification rule” which is merely insignificant extra-solution activity to the judicial exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The limitation “storing the first classification rule” is known by the courts to be well understood, routine, and conventional. The limitation “storing the first classification rule” is directed to the well understood, routine, and conventional computer function storing and retrieving information in memory (MPEP 2106.05(d)(II)(iv)).

Step 2A, Prong 1 analysis: Claim 4 recites:
“determining that a second classification...” (observation)
“determining that a number...” (judgement)
Step 2A: Prong 2 analysis: Claim 4 includes the limitation “outputting the first and third classification rule to the user with an indication suggesting to replace the third classification with the first classification rule”, which is merely an insignificant extra-solution activity to the judicial exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The limitation “outputting the first and third classification rule to the user with an indication suggesting to replace the third classification with the first classification rule” is known by the courts to be well understood, routine, and conventional. The limitation is directed to the well understood, routine, and conventional computer function receiving or transmitting data over a network, more specifically, sending messages over a network. (MPEP 2106.05(d)(II)(i)).

Step 2A, Prong 1 analysis: Claim 5 recites:
“wherein the ML algorithm...” (evaluation)
Step 2A: Prong 2 analysis: Claim 5 does not incorporate any further limitations that are not directed to an abstract idea. This judicial exception is not integrated into a practical application because the additional element in claim 5 “wherein the ML algorithm comprises one of: (i) a decision tree based classifier, (ii) a support vector machine, and (iii) an artificial neural network, wherein the ML algorithm generates the feature vector”, corresponds to mere instructions to implement an abstract idea or other exception on a computer. The claims are directed to an abstract idea.

Step 2A: Prong 1 analysis: Claim 6 recites:
“wherein the plurality of features comprise...” (extra-solution activity)
Step 2A: Prong 2 analysis: Claim 6 includes the limitation “wherein the plurality of features comprise: (i) the plurality of classifications, (ii) a type of each of the plurality of classifications, (iii) a data format of each of the plurality of assets, (iv) a relationship between two or more of the plurality of classifications, (v) a project to which each of the plurality of assets belong, (vi) a data quality score computed for each of the plurality of assets, (vii) a set of tags applied to each of the plurality of assets, (viii) a name of each of the plurality of assets, (ix), a textual description of each of the plurality of assets, and (x) a group of assets comprising a subset of the plurality of assets”, which is merely a part of the insignificant extra-solution activity to the judicial exception in claim 1. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
Step 2B analysis: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The limitation “wherein the plurality of features comprise: (i) the plurality of classifications, (ii) a type of each of the plurality of classifications, (iii) a 

Step 2A: Prong 1 analysis: Claim 7 recites:
“wherein the plurality of assets comprise...” (extra-solution activity)
Step 2A: Prong 2 analysis: Claim 7 includes the limitation “wherein the plurality of assets comprise: (i) a database, (ii) files, (iii) columns in the database, and (iv) a table in the database”, which is merely a part of the insignificant extra-solution activity to the judicial exception in claim 1. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The limitation “wherein the plurality of assets comprise: (i) a database, (ii) files, (iii) columns in the database, and (iv) a table in the database” is known by the courts to be well understood, routine, and conventional. The limitation is directed to the well understood, routine, and conventional computer function storing and retrieving information in memory (MPEP 2106.05(d)(II)(iv)).





Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claims 1-5 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 20200067861 A1 to Leddy et al. (hereinafter, “Leddy”), in view of U.S. Pub. No. 20170250915 A1 to Long et al. (hereinafter, “Long”), further in view of U.S. Patent No. 7788652 B2 to Plesko and Tariti (hereinafter, “Plesko”).

As per claim 1, Leddy teaches receiving a plurality of assets from a data catalog and a respective plurality of classifications applied to the plurality of assets in the data catalog ([1662] “This example process implements machine learning classifiers, such as: Logistic regression, random forest, SVM and naïve Bayes, respectively. The processes receive as input TFIDF training and test sets along with their labels as inputs and outputs predicted labels upon the given test set. It first trains the classifier using the given test set and its label, and then predicts the labels of the test set. Once the classifiers have been tested, they are connected to classify the input messages pipelined to the applicable ML classifiers.” [1233] “In one embodiment, the system is trained with one or more datasets containing scam messages, pruned with datasets containing one or more ham messages, and tested with datasets containing one or more scam messages. Here, training corresponds to adding signatures generated from vectors representing phrases that occur in scam messages. These signatures are added to a data structure in memory, or to a database, and pruning corresponds to removing signatures from this data structure or database if they occur in a ham message.” [1234] “As described in more detail below, a dataset is a collection of messages.” [1235] “In another embodiment, no ham messages are used for pruning, but instead a database representing ham messages is used. The COCA database, described in more detail below, is one ;
extracting, for a plurality of features, feature data from the plurality of assets and the plurality of asset classifications ([1658] “The n-gram feature extraction, in one embodiment, obtains text and the value n as inputs and outputs a list of n-gram tokens. In some embodiments, it uses “nitk” and “sklearn” libraries (see more detail below) to stem the text or remove stop-words from the text.” [1659] “The TFIDF feature extraction, in one embodiment, obtains text files of a particular data format (referred to herein as the “ZapFraud” data format) and outputs TFIDF feature vectors. It first divides the given data into train and test datasets, and then builds n-gram feature list of train and test sets using n-gram feature extractor described above. Then it builds TFIDF feature vectors from n-gram feature vectors of train set. For TF weighting, it uses “double normalization K” scheme with the K value of 0.4.” Examiner Note: The examiner sees Leddy’s training dataset as a plurality of assets and plurality of asset classifications.);
generating a feature vector based on the extracted feature data ([1658] “The n-gram feature extraction, in one embodiment, obtains text and the value n as inputs and outputs a list of n-gram tokens. In some embodiments, it uses “nitk” and “sklearn” libraries (see more detail below) to ; and
generating, by a machine learning (ML) algorithm and based on the feature vector, a first classification rule specifying a condition for applying a first classification of the plurality of classifications to a first asset of the plurality of assets ([1914] “The classification performed at 4204 can be performed using a variety of techniques. For example, a collection of terms can be evaluated using a rule-based approach (e.g., testing for the presence of words, and/or applying a threshold number of words whose presence are needed for a match to be found); using a support vector machine, where the elements of the support vector corresponds to terms or words; and/or using general artificial intelligence methods, such as neural networks, wherein nodes correspond to terms or words, and wherein the values associated with connectors cause an output corresponding essentially to a rule-based method. In each of the aforementioned embodiments, a value associated with the severity of the collection of terms being identified can be generated and output, where multiple values are generated if multiple collections of terms have been identified.”).
identifying a second classification rule ([1914] “The classification performed at 4204 can be performed using a variety of techniques. For example, a collection of terms can be evaluated using a rule-based approach (e.g., testing for the presence of words, and/or applying a threshold number of words whose presence are needed for a match to be found); using a support vector machine, where the elements of the support vector corresponds to terms or words; and/or using general artificial intelligence methods, such as neural networks, wherein nodes correspond to terms ;
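For illustration only, the n-gram and TFIDF feature extraction quoted from Leddy [1658]-[1659] above can be sketched as follows. The sketch assumes the commonly used definition of “double normalization K” term frequency, tf = K + (1 - K) * f / max(f), with K = 0.4 as stated in the quotation, and assumes idf = log(N / df). Leddy does not give these formulas, and all function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams of text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def tf_double_norm_k(tokens, k=0.4):
    """Term frequency under the assumed 'double normalization K' definition."""
    counts = Counter(tokens)
    if not counts:
        return {}
    max_count = max(counts.values())
    return {term: k + (1 - k) * c / max_count for term, c in counts.items()}


def tfidf_vectors(docs, n=1, k=0.4):
    """Build one sparse TF-IDF dict per document (idf formula is an assumption)."""
    token_lists = [ngrams(d, n) for d in docs]
    # Document frequency: number of documents in which each n-gram occurs.
    df = Counter(term for toks in token_lists for term in set(toks))
    total = len(docs)
    vectors = []
    for toks in token_lists:
        tf = tf_double_norm_k(toks, k)
        vectors.append({t: w * math.log(total / df[t]) for t, w in tf.items()})
    return vectors
```

Under these assumptions, an n-gram that occurs in every document receives a weight of zero, so only distinguishing n-grams contribute to the feature vector.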


Long teaches determining that a number of terms present in the first classification rule and the second classification rule exceeds a threshold ([0183] “The communication rules creation module 390 may optionally simplify the set of generated rules to remove duplicate rules and specific rules obviated by general rules. The communication rules creation module 390 may also optionally group rules into rule sets corresponding to application groupings identified, e.g., by the application grouping module 727.” Examiner Note: Determining that a rule is a duplicate is seen as ; and
outputting the first and second classification rules to the user with an indication specifying to replace the second classification rule with the first classification rule ([0163] “The rule simplification module 738 takes as input a set of rules generated by the rule generation module 736 and removes rules obviated by other rules in the set. The rule simplification module 738 removes specific rules from the set that are obviated by one or more general rules in the set. A specific rule is obviated by a general rule if all communication authorized by the specific rule is also authorized by the general rule. For example, the rule simplification module 738 identifies (a) a general rule with a scope portion specifying a set of label values and (b) a specific rule with a scope portion specifying the same set of label values as the general rule as well as additional label values for additional label dimensions. In the example, the general rule covers all the label values for the additional dimensions, so the general rule obviates the specific rule. Simplifying the set of rules facilitates review and revision by an administrator. Alternatively, the rule generation module 736 checks whether a connection is authorized by a generated rule before generating a rule to authorize the connection to improve computational efficiency. In some embodiments, the rule simplification module 738 sends recommendations of proposed simplifications to an administrator through the rule creation interface 740 rather than performing rule simplification automatically. In some instances, the rule simplification module 738 receives a request from administrators to use un-simplified rules, for example, by managing different specific rules that could be obviated by general rule. As another example, the rule simplification module 738 may increase the granularity of the rules based on requests from an administrator (e.g., requests to specify additional labels or rule conditions).”).
determining a number of terms that match and are present in both the first classification rule and the second classification rule ([0183] “The communication rules creation module 390 may optionally simplify the set of generated rules to remove duplicate rules and specific rules obviated by general rules. The communication rules creation module 390 may also optionally group rules into rule sets corresponding to application groupings identified, e.g., by the application grouping module 727.” Examiner Note: Leddy teaches the counting of matching terms (See Leddy 1922 below) but does not teach applying the counting of matching terms to detect duplicate rules. Long teaches checking for duplicate rules. When Long is applied to Leddy, the resulting system would determine the number of terms that match and are present in two rules as a part of its duplicate rule detection. 
Leddy [1922] “In some embodiments, each of the non-equivalent terms in a collection of terms (e.g., “long lost” and “huge sum”) are associated with one or more pointers, and ordered alphabetically. The number of pointers associated with each term is the same as the number of rules for which that term is used. Each rule is represented as a vector of Boolean values, where the vector has the same length as the associated rule contains words. All the binary values are set to false before a message is parsed. The message is parsed by reviewing word by word, starting with the first word….If a word matches a term fully, then all Boolean values that are pointed to by the pointers associated with the term that the word matches are set to true…. In a variant implementation, the system determines how many of the vectors are set to all-true; and outputs a counter corresponding to this number.”);
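For illustration only, the matching scheme quoted from Leddy [1922] can be sketched as follows: each rule is a Boolean vector with one slot per term, each term points to the slots it controls, and the output is a count of vectors that are all true. The sketch assumes single-word terms and simple whitespace tokenization, which the quotation does not require; the function name is hypothetical.

```python
from collections import defaultdict


def count_satisfied_rules(rules, message):
    """Count rules whose terms all occur in message (single-word terms assumed)."""
    # One Boolean vector per rule, all False before the message is parsed.
    vectors = [[False] * len(rule) for rule in rules]
    # Map each term to the (rule index, position) slots it points to.
    pointers = defaultdict(list)
    for r, rule in enumerate(rules):
        for p, term in enumerate(rule):
            pointers[term].append((r, p))
    # Parse the message word by word and set the slots of matching terms.
    for word in message.lower().split():
        for r, p in pointers.get(word, []):
            vectors[r][p] = True
    # A rule is satisfied when every slot of its vector is True.
    return sum(1 for vec in vectors if vec and all(vec))
```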

Leddy and Long are analogous art because they are both directed towards machine learning classifiers. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Leddy’s rule generation system with Long’s rule comparison and review. The modification would have been obvious to one of ordinary skill in the art because they would have been motivated to decrease memory usage, which can be accomplished by pruning redundant rules (Long [0183]).
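For illustration only, one way the proposed combination could count matching terms during duplicate-rule detection is sketched below. Representing a rule as a collection of terms, the function names, and the threshold value are hypothetical; neither Leddy nor Long specifies this code.

```python
def matching_term_count(rule_a, rule_b):
    """Number of distinct terms present in both rules."""
    return len(set(rule_a) & set(rule_b))


def is_candidate_duplicate(rule_a, rule_b, threshold):
    """True when the number of matching terms exceeds the threshold."""
    return matching_term_count(rule_a, rule_b) > threshold
```

For example, two rules that share the terms “name” and “contains” have a matching-term count of 2 and would be flagged as candidate duplicates for any threshold below 2.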

Leddy and Long disclose comparison of rules and user review of rules, but do not explicitly disclose comparison of rule data types.

Plesko teaches upon determining that the number of terms that match exceeds a threshold, determining whether differences between the first and second classification rules are significant based on determining whether the differences only include use of different data types (Column 1, lines 29-32 “These types can be checked versus a set of rules during compilation of a program written in the language. If the source code written in the typed language violates one of the type rules, a compiler error is determined.” Column 8 lines 46-60, “Assume for purpose of this example that I is a signed integer type, U is an unsigned integer type, X is either type of integer, F is float, and N is any of the above. FIG. 6 shows the hierarchical relationship between these types. Type N is at the top of the hierarchy. The types F and X branch down from type N to form the subsequent level of the hierarchy. Lastly, types U and I branch down from the X type to form the lowest level of the hierarchy. Thus, for an `ADD` intermediate language instruction, according to this rule only type N or lower in the hierarchy can be processed by the add instruction, and the operands must be no higher on the hierarchy than the result. For instance, two integers can be added to produce an integer (I=ADD i, i), or an integer and a float can be added to produce a float (F=ADD i, f). However, a float and an integer cannot be added to produce an integer (I=ADD i, f).” Figure 6. Examiner Note: The combination of Leddy and Long teaches the comparison of two classification rules (See Leddy 1922, Long 0183 above and Long 0163 below), but does not specifically disclose comparison of data types. Plesko teaches both weak and strong type checking, wherein a rule will flag an error if an incorrect type is used (but will not, if a compatible type is used). 
Long [0163] “The rule simplification module 738 takes as input a set of rules generated by the rule generation module 736 and removes rules obviated by other rules in the set. The rule simplification module 738 removes specific rules from the set that are obviated by one or more general rules in the set. A specific rule is obviated by a general rule if all communication authorized by the specific rule is also authorized by the general rule. For example, the rule simplification module 738 identifies (a) a general rule with a scope portion specifying a set of label values and (b) a specific rule with a scope portion specifying the same set of label values as the general rule as well as additional label values for additional label dimensions. In the example, the general rule covers all the label values for the additional dimensions, so the general rule obviates the specific rule. Simplifying the set of rules facilitates review and revision by an administrator. Alternatively, the rule generation module 736 checks whether a connection is authorized by a generated rule before generating a rule to authorize the connection to improve computational efficiency. In some embodiments, the rule simplification module 738 sends recommendations of proposed simplifications to an administrator through the rule creation interface 740 rather than performing rule simplification automatically. In some instances, the rule simplification module 738 receives a request from administrators to use un-simplified rules, for example, by managing different specific rules that could be obviated by general rule. As another example, the rule simplification module 738 may increase the granularity of the rules based on requests from an administrator (e.g., requests to specify additional labels or rule conditions).”); and
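upon determining that the differences between the first and second classification rules are not significant because the differences only include use of different data types, refraining from suggesting the first classification rule to a user (Column 1, lines 29-32 “These types can be checked versus a set of rules during compilation of a program written in the language. If the source code written in the typed language violates one of the type rules, a compiler error is determined.”).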

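Leddy, Long, and Plesko are analogous art because they are directed towards data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Leddy’s rule generation system with Long’s rule comparison and review and Plesko’s type checking. The modification would have been obvious to one of ordinary skill in the art because they would have been motivated to decrease memory usage, which can be accomplished by pruning redundant rules (Long [0183]).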
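For illustration only, the type hierarchy quoted from Plesko (Column 8, lines 46-60) can be sketched as follows. This is a minimal sketch and not Plesko's implementation: the names PARENT, level, and add_is_valid are hypothetical, and reading "no higher on the hierarchy" as a comparison of hierarchy levels is an assumption chosen because it reproduces the three examples in the quoted passage.

```python
# Hypothetical encoding of the hierarchy described for Plesko FIG. 6:
# N at the top, F and X one level down, U and I below X.
PARENT = {"N": None, "F": "N", "X": "N", "U": "X", "I": "X"}


def level(type_name):
    """Return how many steps a type sits below N (N itself is level 0)."""
    depth = 0
    while PARENT[type_name] is not None:
        type_name = PARENT[type_name]
        depth += 1
    return depth


def add_is_valid(result, left, right):
    """Assumed reading of the ADD rule: no operand may sit higher in the
    hierarchy (a smaller level number) than the result type."""
    return all(level(operand) >= level(result) for operand in (left, right))


# The three examples from the quoted passage.
print(add_is_valid("I", "I", "I"))  # I = ADD i, i
print(add_is_valid("F", "I", "F"))  # F = ADD i, f
print(add_is_valid("I", "I", "F"))  # I = ADD i, f
```

Under this reading, the first two instructions pass the check and the third is flagged, which matches the outcome described in the quoted passage.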

As per claim 2, the combination of Leddy, Long, and Plesko thus far teaches The method of claim 1.
Leddy teaches further comprising: determining that a second classification of the plurality of classifications applied to the first asset was applied to the first asset by a user ([1665] “In some embodiments, messages are forwarded to manual reviewers, for them to make a classification. The classification can be used to create a new rule, or to train the system. It can also be used to generate a classification decision for the manually reviewed message, and for closely related such messages in the manual review queue. The messages placed in the manual review queue can be selected based on one or more of the following:”); and
identifying a third classification rule generated by the ML algorithm ([1914] “The classification performed at 4204 can be performed using a variety of techniques. For example, a collection of terms can be evaluated using a rule-based approach (e.g., testing for the presence of words, and/or applying a threshold number of words whose presence are needed for a match to be found); using a support vector machine, where the elements of the support vector corresponds to .

Leddy does not specifically disclose determining that a number of terms that match and are present in both the first classification rule and the third classification rule exceeds a threshold; and outputting the first and third classification rules to the user with an indication suggesting to replace the third classification rule with the first classification rule.

Long teaches determining that a number of terms that match and are present in the first classification rule and the third classification rule exceeds a threshold ([0183] “The communication rules creation module 390 may optionally simplify the set of generated rules to remove duplicate rules and specific rules obviated by general rules. The communication rules creation module 390 may also optionally group rules into rule sets corresponding to application groupings identified, e.g., by the application grouping module 727.” Examiner Note: Leddy teaches the counting of matching terms (See Leddy 1922 below) but does not teach applying the counting of matching terms to detect duplicate rules. Long teaches checking for duplicate rules. When Long is applied to Leddy, the resulting system would determine the number of terms that match and are present in two rules as a part of its duplicate rule detection.
Leddy [1922] “In some embodiments, each of the non-equivalent terms in a collection of terms (e.g., “long lost” and “huge sum”) are associated with one or more pointers, and ordered alphabetically. The number of pointers associated with each term is the same as the number of rules for which that term is used. Each rule is represented as a vector of Boolean values, where the vector has the same length as the associated rule contains words. All the binary values are set to false before a message is parsed. The message is parsed by reviewing word by word, starting with the first word….If a word matches a term fully, then all Boolean values that are pointed to by the pointers associated with the term that the word matches are set to true…. In a variant implementation, the system determines how many of the vectors are set to all-true; and outputs a counter corresponding to this number.”); and
outputting the first and third classification rules to the user with an indication suggesting to replace the third classification rule with the first classification rule ([0163] “The rule simplification module 738 takes as input a set of rules generated by the rule generation module 736 and removes rules obviated by other rules in the set. The rule simplification module 738 .


Leddy, Long, and Plesko are analogous art because they are directed towards data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Leddy’s rule generation system with Long’s rule comparison and review and Plesko’s type checking. The modification would have been obvious to one of ordinary skill in the art because they would have been motivated to decrease memory usage, which can be accomplished by pruning redundant rules (Long [0183]).
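For illustration only, the matched-term comparison discussed in the Examiner Note above can be sketched as follows. This is a minimal sketch under stated assumptions: rules are modeled as plain lists of terms, and the function names, the example rules, and the threshold values are all hypothetical. It does not reproduce the pointer and Boolean-vector structure of Leddy [1922], nor any code from Long.

```python
def matching_term_count(rule_a, rule_b):
    """Count the terms that are present in both rules."""
    return len(set(rule_a) & set(rule_b))


def exceeds_threshold(rule_a, rule_b, threshold):
    """True when the number of shared terms is strictly above the threshold."""
    return matching_term_count(rule_a, rule_b) > threshold


# Hypothetical rules; "long lost" and "huge sum" are the example terms
# quoted from Leddy [1922], the remaining terms are made up.
first_rule = ["long lost", "huge sum", "inheritance"]
third_rule = ["long lost", "huge sum", "lottery"]

print(matching_term_count(first_rule, third_rule))
print(exceeds_threshold(first_rule, third_rule, threshold=1))
```

A pair of rules whose shared-term count exceeds the threshold would then be a candidate for the duplicate-rule review described in Long [0183].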

As per claim 3, the combination of Leddy, Long, and Plesko thus far teaches The method of claim 1. 
Leddy teaches further comprising: storing the first classification rule ([0134] “In some embodiments, authoring a new rule includes obtaining parameters (e.g., new URLs, phrases, etc.), and placing them into an appropriate format for a rule. In some embodiments, the new generated rules are stored in rules database 180. The rule is then loaded back into its appropriate, corresponding filter (e.g., a new URL rule is added to the URL filter, a new phrase rule is added to the phrase filter, etc.). In some embodiments, an update is made to an in-memory representation by the filter. Thus, filters can be dynamically updated.”);
determining a new asset has been added to the data catalog ([0269] “Storage (140) is configured to store and retrieve data, for example, in secondary storage. Storage (140) can include, but is not limited to, file systems, version control systems, and databases. Storage (140) can be used to store and retrieve communications received by a user and forwarded to the system for analysis, such as email messages (101). Storage (140) can also store and retrieve configuration (120).” [0851] “After Honeypot accounts (302) H are created, they can be exposed to scammers (301) in a variety of ways, as described herein. In some embodiments, when Scammers (301) send messages to these accounts, these messages are read (303), classified by type (305) and stage (306) based on the Rules and Rule Families they hit. In one embodiment, for each new message that arrives at a H (302), a search (307) for similar S message is made in the repository (308). In various embodiments. messages are matched based on one or more of the following…” [0854] “Matching Rules—Where message matches are based on the Rules that match both the new message and an S entry.”);
determining that the new asset satisfies the condition specified in the first classification rule ([0851] “After Honeypot accounts (302) H are created, they can be exposed to scammers (301) in a variety of ways, as described herein. In some embodiments, when Scammers (301) send messages to these accounts, these messages are read (303), classified by type (305) and stage (306) based on the Rules and Rule Families they hit. In one embodiment, for each new message that arrives at a H (302), a search (307) for similar S message is made in the repository (308). In various embodiments. messages are matched based on one or more of the following…” [0854] “Matching Rules—Where message matches are based on the Rules that match both the new message and an S entry.” [0742] “TFIDF or variants can be used to return a quantitative judgment as to the likelihood that a newly arrived and heretofore unclassified message is scam or ham. Here, the TFIDF value assigned to a phrase correlates to the likelihood that a message containing it is a scam.”); and
programmatically applying the first classification to the new asset ([0851] “After Honeypot accounts (302) H are created, they can be exposed to scammers (301) in a variety of ways, as described herein. In some embodiments, when Scammers (301) send messages to these accounts, these messages are read (303), classified by type (305) and stage (306) based on the Rules and Rule Families they hit. In one embodiment, for each new message that arrives at a H (302), a search (307) for similar S message is made in the repository (308). In various embodiments. messages are matched based on one or more of the following…” [0854] “Matching Rules—Where message matches are based on the Rules that match both the new message and an S entry.”).

As per claim 4, the combination of Leddy, Long and Plesko thus far teaches The method of claim 1.
Leddy teaches further comprising: determining that a second classification of the plurality of classifications was programmatically applied to the first asset based on a third classification rule generated by the ML algorithm ([0276] “In one embodiment, messages are classified as red (130), yellow (131), or green (132), as described in further detail below. In some embodiments, messages classified as red are retained permanently without restriction for further analysis. Messages classified as yellow can be retained temporarily and can be subjected to additional Filtering or analysis. In some embodiments, messages classified as green are not retained in storage (120). Retained messages saved in storage (140) can be used, for example, for offline analysis by entities such as security analysts and researchers. The results can be reviewed to determine where changes to configuration (120) can be made to improve the accuracy of classification.” [0828] “Honeypot accounts can be created to collect spam and scam of particular types. Messages received for a particular type of honeypot account can then be classified based on the honeypot account type. For example, if the honeypot account is configured to attract scammers associated with inheritance scams, then messages associated with the honeypot account are more likely to be associated with inheritance scams. As another example, a honeypot account can be created to collect romance scams. For example, a honeypot account can be created by generating a dating profile on a dating site. Responses from potential scammers to the dating profile are then more likely to be romance scams, and are classified as such.” [0832] “In some embodiments, appropriate response(s) to messages are determined by the customize response 309 based on the results of the type classifier. 
For example, suppose that a message has been classified as a romance scam by the type classifier, the customize response selects a response from a set of romance responses (e.g., at random, based on a relevance match between the response and context/content of the message, etc.), which is then .

Leddy does not specifically disclose determining that a number of terms that match and are present in both the first classification rule and the third classification rule exceeds a threshold; and outputting the first and third classification rules to a user with an indication suggesting to replace the third classification rule with the first classification rule.

Long teaches determining that a number of terms that match and are present in the first classification rule and the third classification rule exceeds a threshold ([0183] “The communication rules creation module 390 may optionally simplify the set of generated rules to remove duplicate rules and specific rules obviated by general rules. The communication rules creation module 390 may also optionally group rules into rule sets corresponding to application groupings identified, e.g., by the application grouping module 727.” Examiner Note: Leddy teaches the counting of matching terms (See Leddy 1922 below) but does not teach applying the counting of matching terms to detect duplicate rules. Long teaches checking for duplicate rules. When Long is applied to Leddy, the resulting system would determine the number of terms that match and are present in two rules as a part of its duplicate rule detection. 
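Leddy [1922] “In some embodiments, each of the non-equivalent terms in a collection of terms (e.g., “long lost” and “huge sum”) are associated with one or more pointers, and ordered alphabetically. The number of pointers associated with each term is the same as the number of rules for which that term is used. Each rule is represented as a vector of Boolean values, where the vector has the same length as the associated rule contains words. All the binary values are set to false before a message is parsed. The message is parsed by reviewing word by word, starting with the first word….If a word matches a term fully, then all Boolean values that are pointed to by the pointers associated with the term that the word matches are set to true…. In a variant implementation, the system determines how many of the vectors are set to all-true; and outputs a counter corresponding to this number.”); and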
outputting the first and third classification rules to the user with an indication suggesting to replace the third classification rule with the first classification rule ([0163] “The rule simplification module 738 takes as input a set of rules generated by the rule generation module 736 and removes rules obviated by other rules in the set. The rule simplification module 738 removes specific rules from the set that are obviated by one or more general rules in the set. A specific rule is obviated by a general rule if all communication authorized by the specific rule is also authorized by the general rule. For example, the rule simplification module 738 identifies (a) a general rule with a scope portion specifying a set of label values and (b) a specific rule with a scope portion specifying the same set of label values as the general rule as well as additional label values for additional label dimensions. In the example, the general rule covers all the label values for the additional dimensions, so the general rule obviates the specific rule. Simplifying the set of rules facilitates review and revision by an administrator. Alternatively, the rule generation module 736 checks whether a connection is authorized by a generated rule before generating a rule to authorize the connection to improve computational efficiency. In some embodiments, the rule simplification module 738 sends recommendations of proposed simplifications to an administrator through the rule creation interface 740 rather than performing rule simplification automatically. In some instances, the rule simplification module 738 receives a request from administrators to use un-simplified rules, for example, by managing different specific rules that could be obviated by general rule. As another example, the rule simplification module 738 may increase the granularity of the rules based on requests from an administrator (e.g., requests to specify additional labels or rule conditions).”).

Leddy, Long, and Plesko are analogous art because they are directed towards data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Leddy’s rule generation system with Long’s rule comparison and review and Plesko’s type checking. The modification would have been obvious to one of ordinary skill in the art because they would have been motivated to decrease memory usage, which can be accomplished by pruning redundant rules (Long [0183]).
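For illustration only, the rule simplification described in Long [0163] and [0183] (removing duplicate rules and specific rules obviated by general rules) can be sketched as follows. This is a minimal sketch, not Long's implementation: modeling a rule's scope as a dictionary of label dimension to label value is an assumption, and the names obviates and simplify are hypothetical. The label values are borrowed from the example in Long [0162].

```python
def obviates(general, specific):
    """Assumed test: a general rule obviates a specific rule when the specific
    rule carries every label of the general rule plus additional labels."""
    return general != specific and general.items() <= specific.items()


def simplify(rules):
    """Drop exact duplicates, then drop rules obviated by another rule."""
    unique = []
    for rule in rules:
        if rule not in unique:
            unique.append(rule)
    return [
        rule
        for rule in unique
        if not any(obviates(other, rule) for other in unique)
    ]


general_rule = {"Application": "Human Resources"}
specific_rule = {"Application": "Human Resources", "Environment": "Production"}

# One duplicate and one obviated rule are removed; the general rule remains.
print(simplify([general_rule, specific_rule, general_rule]))
```

As the quoted passage notes, the result could instead be presented to an administrator as a recommendation rather than applied automatically; that review step is not modeled here.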


As per claim 5, the combination of Leddy, Long, and Plesko teaches The method of claim 1.
Leddy teaches wherein the ML algorithm comprises one of: (i) a decision tree based classifier, (ii) a support vector machine, and (iii) an artificial neural network, wherein the ML algorithm generates the feature vector ([0136] “In some embodiments, the training module is configured to use machine learning techniques to perform training. For example, obtained messages/communications can be used as training/test data upon which authored rules are trained and refined. Various machine learning algorithms and techniques, such as support vector machines (SVMs), neural networks, etc. can be used to performing the training/updating.”).





As per claim 7, the combination of Leddy, Long, and Plesko teaches The method of claim 1.
Long teaches wherein the plurality of assets comprise: (i) a database, (ii) files, (iii) columns in the database, and (iv) a table in the database (Long, [0085] “Rule List #1/Rule #2 allows a web server to connect to the PostgreSQL service on a database server. Specifically, the allowance of a connection is specified by “Access Control” in the Function portion. The “web server” is specified by “<Role, Web>” in the UB portion. The “PostgreSQL” is specified by “PostgreSQL” in the Service portion. The “database server” is specified by “<Role, Database>” (a label set that includes only one label) in the PB portion.” [0162] “The rule generation module 736 may also generate inter-group rules corresponding to communication between application groups. An inter-group rule has a scope that applies only to the provider device of a service. For example, a rule may have a scope portion specifying the labels <Application, Human Resources>, <Environment, Production>, and <Location, US>. The example rule has a PB portion specifying <Role, Database> and a UB portion specifying <Role, Database>, <Application, Enterprise Resource Planning>, <Environment, Staging>, and <Location, Europe>. The labels specified in the UB portion override the labels along the same dimension in the scope portion. To generate an inter-group rule, the rule generation module 736 assigns group-level labels of the provider device to the scope portion and group-level labels of the consumer device to the UB portion. The rule generation module 736 may include additional labels in the UB portion and PB portion of the generated rule depending on the level of granularity specified by the administrator, as described above.” Examiner Note: Leddy specifies a database with tables necessarily containing columns in .

Leddy, Long, and Plesko are analogous art because they are directed towards data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Leddy’s rule generation system with Long’s rule comparison and review and Plesko’s type checking. The modification would have been obvious to one of ordinary skill in the art because they would have been motivated to decrease memory usage, which can be accomplished by pruning redundant rules (Long [0183]).

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. 20200067861 A1 to Leddy et al. (hereinafter, “Leddy”), in view of U.S. Pub. No. 20170250915 A1 to Long et al. (hereinafter, “Long”), further in view of U.S. Pat. No. 7788652 B2 to Plesko et Tariti (hereinafter, “Plesko”), further in view of “An Unsupervised Machine Learning Approach to Body Text and Table of Contents Extraction from Digital Scientific Articles” to Klampfl et Kern (hereinafter, “Klampfl”).

As per claim 6, the combination of Leddy, Long, and Plesko teaches The method of claim 1.

Leddy teaches wherein the plurality of features comprise: (i) the plurality of classifications, (ii) a type of each of the plurality of classifications, (iii) a data format of each of the plurality of assets, (iii) a relationship between two or more of the plurality of classifications, (iv) a project to which each of the plurality of assets belong, (v) a data quality score computed for each of the plurality of assets, (vi) a set of tags applied to each of the plurality of assets, (vii) a name of each of the plurality of assets, (viii), a textual description of each of the plurality of assets, and (ix) a group of assets comprising a subset of the plurality of assets (Examiner Note: The examiner sees equivalents to the listed features as shown below:
the plurality of classifications = [0067] “In some embodiments, the results returned by individual filters can be combined in a variety of ways. For example, Boolean logic or arithmetic can be used to combine results. As one example, suppose that for a message, a rule from a romance scam family fired/hit, as well as a rule from a Nigeria family of scam rules. The results of rules/filters from both those families having been fired can be 
a type of each of the plurality of classifications = Examiner Note: The ‘color’ classifications (red, green, yellow, etc.) and the ‘kind’ classifications (romance, BEC, etc.) are seen as equivalent to different types of classifications.
a data format of each of the plurality of assets = [0062] “Messages 162 are obtained. The messages can include email, SMS, social network posts (e.g., Tweets, Facebook® messages, etc.), or any other appropriate type of communication.”
a relationship between two or more of the plurality of classifications = [0067-0068], a message being classified as a ‘Romance scam’ increases the likelihood that a message is also classified as ‘red’, etc.
a project to which each of the plurality of assets belong = [0068] “Similarly, the green bucket can be subdivided into messages of different priority, messages corresponding to different projects…”
a data quality score computed for each of the plurality of assets = [1413] “In another embodiment, if the scam training messages have a confidence factor associated with them (e.g, if the first message is classified scam with 90% probability, and the second with 50% 
a set of tags applied to each of the plurality of assets = [0310] “The response to potential scam messages can be varied by characteristics including but not limited to the email address, email domain, IP address, IP address range, country, internet hosting service, phone number, phone number blocks, instant messenger ID, message contents, or send rate.” 
a name of each of the plurality of assets = [0096] “2. Does the message have a high-risk word in its subject line?” Examiner Note: The subject of the message is seen as equivalent to the ‘name’ of the message.
a group of assets comprising a subset of the plurality of assets = [1245] “In some embodiments, the system accepts “batches” (subsets of datasets). An example dataset comprises input files listed in a file, for example, one file system path per line; or the data set can be identified by a file system directory, or a file containing a list of directories. In some embodiments, a batch is a dataset coupled with a desired maximum number of messages to be processed.”).

Leddy does not specifically disclose receiving a textual description of each of the plurality of assets.

Klampfl teaches The method of claim 1, wherein the plurality of features comprise: (i) the plurality of classifications, (ii) a type of each of the plurality of classifications, (iii) a data format of each of the plurality of assets, (iii) a relationship between two or more of the plurality of classifications, (iv) a project to which each of the plurality of assets belong, (v) a data quality score computed for each of the plurality of assets, (vi) a set of tags applied to each of the plurality of assets, (vii) a name of each of the plurality of assets, (viii), a textual description of each of the plurality of assets, and (ix) a group of assets comprising a subset of the plurality of assets (Page 144, Introduction, Lines 1-5. “As the growth of the global volume of scientific literature reaches unprecedented levels, there is an increasing demand for automated processing systems that support both librarians and researchers in managing collections of scholarly articles. The tasks of these systems range from the extraction of meta-data of a paper to the extraction of the table of contents and the body text, as well as named entities and facts contained therein.” Page 147, Categorization of Text Blocks, lines 5-8, “Meta-Data. To detect meta-data blocks, which contain information about the published article, e.g., the title, the journal, or the abstract, we reused previously published work [2] which uses sequence classifiers to detect the following types of meta-data blocks: Title, Journal, Author, Affiliation, Email, and Abstract.” Examiner Note: The examiner recognizes the abstract of a scholarly article as a textual description of the article as a whole. When combined with Leddy’s disclosed assets, this would result in Leddy’s system also including a textual description of the assets of the system.).

Leddy, Long, Plesko, and Klampfl are analogous art because they are directed towards data processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Leddy’s classification system with Klampfl’s analysis of scholarly articles, Long’s rule comparison, and Plesko’s type checking. The modification would have been obvious to one of ordinary skill in the art because they would have been motivated to increase the flexibility of the system, which can be accomplished by making it compatible with scholarly articles (Klampfl, Introduction).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: U.S. Pub. No. 20070112824 A1 to Lock et al., U.S. Pat. No. 8417709 B2 to Chiticariu et al., U.S. Pub. No. 20180157988 A1 to Jagota, U.S. Pub. No. 20160063386 A1 to Xie et al., U.S. Pub. No. 20160012352 A1 to Peng et al., U.S. Pub. No. 20170323112 A1 to Tran et al., and “A Review of Machine Learning Algorithms for Text-Documents Classification” to Khan et al.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL G SMITH whose telephone number is (571)272-9730. The examiner can normally be reached on Monday-Friday from 9:30 A.M. to 6:00 P.M. EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at telephone number 571-272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
Respectfully Submitted,
                                                                                                                                                                       
/PAUL GORDON SMITH/ Examiner, Art Unit 2126
/MICHAEL J HUNTLEY/ Primary Examiner, Art Unit 2116