DETAILED ACTION
Summary and Status of Claims
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to Application No. 16/896,895 filed 6/9/2020.
Claims 1-20 are pending.
Claims 1-20 are rejected under 35 U.S.C. 112(b).
Claims 1, 2, 5, 8, 9, 17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) in view of Chen et al. (US Patent Pub 2009/0006392).
Claims 3, 4, 6, and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) in view of Chen et al. (US Patent Pub 2009/0006392), further in view of Rosh et al. (US Patent Pub 2019/0044820).
Claims 10 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) in view of Chen et al. (US Patent Pub 2009/0006392), further in view of Ellis et al. (US Patent Pub 2019/0102659).
Claims 13-16 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) in view of Chen et al. (US Patent Pub 2009/0006392), further in view of Malak et al. (US Patent Pub 2019/0385014).
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) in view of Chen et al. (US Patent Pub 2009/0006392), further in view of Liu (US Patent Pub 2018/0189481).

Claim Rejections - 35 USC § 112
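The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION. The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
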
Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claims 1, 4, 19, and 20 recite “… regular expression that is able to be used …” followed by some limitations.  This part of the limitation is indefinite because it can be interpreted as rendering the limitation optional.  Simply because a regular expression is “able to be used” to do something does not require that the step be performed.  See MPEP 2111.04(I).
Claim 12 recites “all or substantially all items” in line 2.  The specification does not describe what is considered to be “substantially all”.  As such, the limitation is a relative term with metes and bounds that cannot be reasonably ascertained by one of ordinary skill in the art.
The remaining claims are rejected because they depend on a rejected claim.

Note on Prior Art Rejections
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 103
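The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
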
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 5, 8, 9, 17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) (Jha) in view of Chen et al. (US Patent Pub 2009/0006392) (Chen).
In regards to claim 1, Jha discloses a method, comprising:
a.	determining a regular expression that is able to be used to identify an item as belonging to a specific group among a plurality of different groups (Jha at col. 2, lines 56-59; col. 4, lines 48-51; col. 10, lines 10-12)[1];
c.	testing the regular expression against a sampling of items known to belong to other groups among the plurality of different groups outside the specific group to determine a false positive metric (Jha at col. 2, lines 59-67; col. 3, lines 1-6; col. 15, lines 16-39)[2];
d.	calculating the accuracy metric of the determined regular expression based at least in part on the false positive metric (Jha at col. 15, lines 34-39)[3]; and
e.	providing the accuracy metric for use in evaluating the regular expression (Jha at col. 15, lines 38-39, 66-67; col. 16, lines 1-14)[4].
Jha does not expressly disclose testing the regular expression against a sampling of items known to belong to the specific group to determine a true positive metric and calculating the accuracy metric of the determined regular expression based at least in part on the true positive metric and the false positive metric.
Chen discloses a system and method for data profiling, where regular expressions are utilized to identify data patterns within attribute values.  Chen at para. 0005.  The regular expressions are generated based on an input set of attribute values.  Chen at para. 0020.  The regular expressions are also evaluated to determine which regular expressions are best at determining patterns within a given set of attribute values.  The regular expression is evaluated to determine its quality, which is a combination of the percentage of attribute values it matches (i.e., testing against a sampling of items known to belong to the specific group to determine a true positive metric) and the percentage of values it matches that do not belong within the set of attribute values (i.e., false positives).  Chen at para. 0050.
Jha and Chen are analogous art because they are both directed to the same field of endeavor of using regular expressions for grouping/classifying data.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha by adding the features of testing the regular expression against a sampling of items known to belong to the specific group to determine a true positive metric and calculating the accuracy metric of the determined regular expression based at least in part on the true positive metric and the false positive metric, as disclosed by Chen.
The motivation for doing so would have been to ensure the regular expression is also good at matching the items that it is supposed to match.  Jha discloses that other types of metrics can be used to determine the quality of the regular expressions.  Jha at col. 16, lines 10-14.  Chen discloses using both true positives and false positives to determine the quality of a regular expression for its intended purpose, whereas the test used by Jha uses false positives only.  Modifying Jha with Chen results in an evaluation system that takes into consideration whether the regular expression is also good at matching the items it was intended for.
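For illustration only, the combined evaluation discussed above (a true positive metric from items known to belong to the specific group, and a false positive metric from items known to belong to other groups) can be sketched as follows.  This is a minimal sketch under stated assumptions and is not the implementation of Jha, Chen, or the instant application; the function name match_counts, the pattern, and the sample strings are hypothetical.

```python
import re


def match_counts(pattern, in_group, out_group):
    """Count regex matches in two labeled samples.

    in_group:  items known to belong to the specific group
    out_group: items known to belong to other groups
    Returns (true_positives, false_positives).
    """
    regex = re.compile(pattern)
    # Matches inside the specific group are true positives.
    true_positives = sum(1 for item in in_group if regex.search(item))
    # Matches outside the specific group are false positives.
    false_positives = sum(1 for item in out_group if regex.search(item))
    return true_positives, false_positives


# Hypothetical sample data, invented for this sketch.
in_group = [
    "nginx -g daemon off;",
    "nginx: worker process",
    "/usr/sbin/nginx -c /etc/nginx/nginx.conf",
]
out_group = [
    "postgres -D /var/lib/pgsql",
    "python3 nginx-exporter.py",
    "sshd: root@pts/0",
]

tp, fp = match_counts(r"\bnginx\b", in_group, out_group)
print(tp, fp)
```

In this sketch the pattern matches all three in-group items and also one out-of-group item, which shows why a test that counts false positives alone gives only part of the picture.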

In regards to claim 2, Jha in view of Chen discloses the method of claim 1, wherein determining the regular expression includes automatically generating the regular expression based on text data associated with the specific group.  Jha at Fig. 4; col. 11, lines 4-32.[5]
In regards to claim 5, Jha in view of Chen discloses the method of claim 1, wherein testing the regular expression against the sampling of items known to belong to the other groups among the plurality of different groups outside the specific group includes applying the regular expression to text data associated with the sampling of items.  Chen at paras. 0049-0050.[6]
In regards to claim 8, Jha in view of Chen discloses the method of claim 1, wherein the true positive metric corresponds to a number of items that the regular expression positively matches in the specific group.  Chen at para. 0050.
In regards to claim 9, Jha in view of Chen discloses the method of claim 1, wherein the false positive metric corresponds to a number of items that the regular expression positively matches in the other groups among the plurality of different groups outside the specific group.  Jha at col. 15, lines 34-38.  Chen at para. 0050.[7]
In regards to claim 17, Jha in view of Chen discloses the method of claim 1, wherein items belonging to the specific group and the other groups among the plurality of different groups outside the specific group have been grouped using data clustering.  Jha at col. 1, line 67; col. 2, lines 1-6.

In regards to claim 19, Jha discloses a system, comprising:
a.	one or more processors (Jha at col. 16, lines 59-61) configured to:
i.	determine a regular expression that is able to be used to identify an item as belonging to a specific group among a plurality of different groups (Jha at col. 2, lines 56-59; col. 4, lines 48-51; col. 10, lines 10-12)[8];
iii.	test the regular expression against a sampling of items known to belong to other groups among the plurality of different groups outside the specific group to determine a false positive metric (Jha at col. 2, lines 59-67; col. 3, lines 1-6; col. 15, lines 16-39)[9];
iv.	calculate an accuracy metric of the determined regular expression based at least in part on the false positive metric (Jha at col. 15, lines 34-39)[10]; and
v.	provide the accuracy metric for use in evaluating the regular expression (Jha at col. 15, lines 38-39, 66-67; col. 16, lines 1-14)[11]; and
b.	a memory coupled with the one or more processors and configured to provide the one or more processors with instructions (Jha at col. 16, lines 59-62).
Jha does not expressly disclose testing the regular expression against a sampling of items known to belong to the specific group to determine a true positive metric and calculating the accuracy metric of the determined regular expression based at least in part on the true positive metric and the false positive metric.
Chen discloses a system and method for data profiling, where regular expressions are utilized to identify data patterns within attribute values.  Chen at para. 0005.  The regular expressions are generated based on an input set of attribute values.  Chen at para. 0020.  The regular expressions are also evaluated to determine which regular expressions are best at determining patterns within a given set of attribute values.  The regular expression is evaluated to determine its quality, which is a combination of the percentage of attribute values it matches (i.e., testing against a sampling of items known to belong to the specific group to determine a true positive metric) and the percentage of values it matches that do not belong within the set of attribute values (i.e., false positives).  Chen at para. 0050.
Jha and Chen are analogous art because they are both directed to the same field of endeavor of using regular expressions for grouping/classifying data.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha by adding the features of testing the regular expression against a sampling of items known to belong to the specific group to determine a true positive metric and calculating the accuracy metric of the determined regular expression based at least in part on the true positive metric and the false positive metric, as disclosed by Chen.
The motivation for doing so would have been to ensure the regular expression is also good at matching the items that it is supposed to match.  Jha discloses that other types of metrics can be used to determine the quality of the regular expressions.  Jha at col. 16, lines 10-14.  Chen discloses using both true positives and false positives to determine the quality of a regular expression for its intended purpose, whereas the test used by Jha uses false positives only.  Modifying Jha with Chen results in an evaluation system that takes into consideration whether the regular expression is also good at matching the items it was intended for.

Claim 20 is essentially the same as claim 1 in the form of a computer program product embodied in a non-transitory computer readable storage medium (Jha at col. 16, lines 59-62).  Therefore, it is rejected for the same reasons.

Claims 3, 4, 6, and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) (Jha) in view of Chen et al. (US Patent Pub 2009/0006392) (Chen), further in view of Rosh et al. (US Patent Pub 2019/0044820) (Rosh).
In regards to claim 3, Jha in view of Chen discloses the method of claim 1, but does not expressly disclose wherein the specific group comprises a plurality of software processes.
Rosh discloses a system and method for automatically grouping similar applications and devices together using regular expressions that match the parameters of a device or application.  The regular expressions can be used to match arbitrary strings associated with devices/applications and any data in a configuration management database could potentially be associated with a device/application and used for the purpose of grouping the device/application using regular expressions.  Rosh at paras. 0004, 0130-132.
Jha, Chen, and Rosh are analogous art because they are all directed to the same field of endeavor of pattern matching to organize data.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha in view of Chen by adding the feature of wherein the specific group comprises a plurality of software processes, as disclosed by Rosh.
The motivation for doing so would have been to allow grouping of the devices and applications automatically and intelligently and provide a simplified view of a network map.  Rosh at para. 0004.

In regards to claim 4, Jha in view of Chen and Rosh discloses the method of claim 3, wherein the regular expression is able to be used to determine a database field corresponding to the specific group with which to populate a configuration management database.  Rosh at paras. 0081, 0132.[12]

In regards to claim 6, Jha in view of Chen discloses the method of claim 5, but does not expressly disclose wherein the text data includes commands for starting software processes.
Rosh discloses a system and method for automatically grouping similar applications and devices together using regular expressions that match the parameters of a device or application.  The regular expressions can be used to match arbitrary strings (e.g., software names)[13] associated with devices/applications and any data in a configuration management database could potentially be associated with a device/application and used for the purpose of grouping the device/application using regular expressions.  Rosh at paras. 0004, 0130-132.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha in view of Chen by adding the feature of wherein the text data includes commands for starting software processes, as disclosed by Rosh.
The motivation for doing so would have been to allow grouping of the devices and applications automatically and intelligently and provide a simplified view of a network map.  Rosh at para. 0004.

In regards to claim 7, Jha in view of Chen discloses the method of claim 5, but does not expressly disclose wherein the text data includes parameters that specify configuration information for software processes.
Rosh discloses a system and method for automatically grouping similar applications and devices together using regular expressions that match the parameters of a device or application.  The regular expressions can be used to match arbitrary strings associated with devices/applications and any data in a configuration management database could potentially be associated with a device/application and used for the purpose of grouping the device/application using regular expressions.  Rosh at paras. 0004, 0130-132.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha in view of Chen by adding the feature of wherein the text data includes parameters that specify configuration information for software processes, as disclosed by Rosh.
The motivation for doing so would have been to allow grouping of the devices and applications automatically and intelligently and provide a simplified view of a network map.  Rosh at para. 0004.

Claims 10 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) (Jha) in view of Chen et al. (US Patent Pub 2009/0006392) (Chen), further in view of Ellis et al. (US Patent Pub 2019/0102659) (Ellis).
In regards to claim 10, Jha in view of Chen discloses the method of claim 1, but does not expressly disclose wherein calculating the accuracy metric of the determined regular expression includes calculating a quotient comprising a numerator portion that is based at least in part of the true positive metric and a denominator portion that is based at least in part on the false positive metric.  Chen does disclose the quality metric of a regular expression is based on the two factors.  Chen at para. 0050.
Ellis discloses a system and method to improve accuracy of classifying objects using pattern matching.  Ellis at para. 0012.  The pattern matching model may be evaluated for accuracy.  The accuracy is calculated as a ratio between matches and mismatches (i.e., false positives).  The ratio value is compared to a threshold value and a user may be alerted to the accuracy of the model.  Ellis at para. 0035.
Jha, Chen, and Ellis are analogous art because they are all directed to the same field of endeavor of using pattern matching to organize data.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha in view of Chen by adding the feature of wherein calculating the accuracy metric of the determined regular expression includes calculating a quotient comprising a numerator portion that is based at least in part of the true positive metric and a denominator portion that is based at least in part on the false positive metric, as disclosed by Ellis.
The motivation for doing so would have been to ensure the pattern matching model is accurate within a desired threshold.  Ellis at para. 0035.  

In regards to claim 11, Jha in view of Chen and Ellis discloses the method of claim 10, wherein the numerator portion equals the true positive metric.  Chen at para. 0050.  Ellis at para. 0035.
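For illustration only, one possible quotient of the kind discussed for claims 10 and 11 is sketched below, with a numerator equal to the true positive metric and a denominator that depends in part on the false positive metric.  The specific form TP / (TP + FP) is an assumption chosen for this sketch (it is commonly called precision); neither the claims nor Chen or Ellis, as cited above, are stated to require this particular formula, and the function name quotient_accuracy is hypothetical.

```python
def quotient_accuracy(true_positives, false_positives):
    """Return TP / (TP + FP), one assumed form of a quotient-based accuracy metric.

    The numerator equals the true positive count; the denominator is based
    in part on the false positive count.
    """
    total_matches = true_positives + false_positives
    if total_matches == 0:
        # The regex matched nothing at all; report zero rather than divide by zero.
        return 0.0
    return true_positives / total_matches


# Example: 3 matches inside the specific group, 1 match outside it.
print(quotient_accuracy(3, 1))
```

With three true positives and one false positive, this sketch yields 0.75; a regular expression with no false positives yields 1.0.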

Claims 13-16 are rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) (Jha) in view of Chen et al. (US Patent Pub 2009/0006392) (Chen), further in view of Malak et al. (US Patent Pub 2019/0385014) (Malak).
In regards to claim 13, Jha in view of Chen discloses the method of claim 1, but does not expressly disclose wherein providing the accuracy metric for use in evaluating the regular expression includes transmitting the accuracy metric to a user via a network.  Jha does disclose client devices connected to the system via a network.  Jha at col. 3, lines 65-67; col. 4, lines 1-19.  What is not disclosed is that the accuracy metric is transmitted to the user.
Malak discloses a system and method for generating regular expressions with user assistance.  The system includes a process to improve and optimize the performance of generated regular expressions.  Malak at paras. 0065, 0067.  Malak provides a user interface to allow a user to input data to update a generated regular expression.  Once a regular expression is generated, it is tested against data items that it should match (i.e., true positive match) and ones it shouldn’t (i.e., false positive match).  Malak at para. 0078.  The user is notified if there is a failure to match the examples provided by the user (i.e., providing the accuracy metric to a user via a network) and is given the option of manually repairing the regular expression or changing the examples.  Malak at para. 0095.
Jha, Chen, and Malak are analogous art because they are all directed to the same field of endeavor of pattern matching to organize data.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha in view of Chen by adding the feature of wherein providing the accuracy metric for use in evaluating the regular expression includes transmitting the accuracy metric to a user via a network, as disclosed by Malak.
The motivation for doing so would have been to provide the user with notification of the regular expression not meeting expectations and giving the user options to remedy it.  Malak at para. 0095.

In regards to claim 14, Jha in view of Chen and Malak discloses the method of claim 1, wherein providing the accuracy metric for use in evaluating the regular expression includes providing a user with an option to manually adjust the regular expression.  Malak at para. 0095.
In regards to claim 15, Jha in view of Chen and Malak discloses the method of claim 1, further comprising recalculating the accuracy metric in response to a determination that the accuracy metric falls below a specified threshold.  Malak at para. 0095.[14]
In regards to claim 16, Jha in view of Chen and Malak discloses the method of claim 1, further comprising providing a suggestion to a user to manually adjust the regular expression in response to a determination that the accuracy metric falls below a specified threshold.  Malak at para. 0095.[15]
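For illustration only, the threshold behavior discussed for claims 15 and 16 can be sketched as a check that produces a suggestion to the user when the accuracy metric falls below a specified threshold.  This is a minimal sketch and not Malak's implementation; the function name review_regex and the message text are hypothetical, and the default threshold of 1.0 merely mirrors the interpretation in this action that the threshold is a 100% match with no false positives.

```python
def review_regex(accuracy, threshold=1.0):
    """Return a suggestion string if accuracy is below threshold, else None."""
    if accuracy < threshold:
        return (
            "Accuracy {:.2f} is below the threshold {:.2f}; "
            "consider manually adjusting the regular expression."
        ).format(accuracy, threshold)
    # The metric meets the threshold, so no suggestion is needed.
    return None


# A regex scoring 0.75 against a threshold of 1.0 triggers a suggestion.
print(review_regex(0.75))
```

In a fuller workflow, the accuracy metric would be recalculated after the user adjusts the regular expression and the check would be repeated.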

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Jha et al. (US Patent 10,956,522) (Jha) in view of Chen et al. (US Patent Pub 2009/0006392) (Chen), further in view of Liu (US Patent Pub 2018/0189481).
In regards to claim 18, Jha in view of Chen discloses the method of claim 17, but does not expressly disclose wherein the data clustering is associated with density-based spatial clustering of applications with noise.
Liu discloses a system and method for program file classification with the added benefit of reducing workload in identifying malicious program files.  Liu at para. 0006.  Liu utilizes regular expressions to normalize the directory paths of program files.  Liu at paras. 0087-88.  The server uses various methods to perform clustering of program files.  One type of clustering used is DBSCAN.  Liu at para. 0069.
Jha, Chen, and Liu are analogous art because they are all directed to the same field of endeavor of using regular expressions for pattern matching.
At the time before the effective filing date of the instant application, it would have been obvious to one of ordinary skill in the art to modify Jha in view of Chen by adding the feature of wherein the data clustering is associated with density-based spatial clustering of applications with noise, as disclosed by Liu.
The motivation for doing so would have been that Jha already discloses clustering based on density.  Jha at col. 13, lines 33-51.  DBSCAN is a particular method of density-based clustering.

Allowable Subject Matter
Claim 12 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims with amendments to overcome their respective rejections under 112(b) as set forth above.

Additional Prior Art
Additional relevant prior art references are listed on the attached PTO-892 form.  Some examples are:
Satish et al. (US Patent 9,330,258) discloses a system and method for identifying URLs linking to potentially malicious resources using regular expressions.
Zheng et al. (US Patent 20178/0187735) discloses a system and method for rating of patterns for pattern matching in a network system.
Jones et al. (US Patent Pub 2018/0063181) discloses a system and method for remote identification of enterprise threats using pattern matching.
Alexander et al. (US Patent Pub 2018/0341468) discloses a system and method for analyzing source codes using pattern matching.
Valgenti et al. (US Patent Pub 2019/0089723) discloses a system and method for automated signature generation and refinement for pattern matching.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL LE whose telephone number is (571)272-7970.  The examiner can normally be reached on M-F: 9:30am-6pm ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi, can be reached on 571-272-4078.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/MICHAEL LE/Examiner, Art Unit 2163
/TONY MAHMOUDI/Supervisory Patent Examiner, Art Unit 2163

[1] Regular expressions are used to identify whether a content item belongs in a particular cluster of non-compliant content items for a particular policy among a plurality of clusters for different policies (i.e., identify an item as belonging to a specific group among a plurality of different groups).
[2] Match rate (i.e., accuracy metric) for testing the regex against items that belong in groups outside of the specific group (i.e., false positives) is determined and output.
[3] The regular expression is matched against compliant content items (false positives) to determine a match rate (i.e., accuracy metric).
[4] The match rate is used by the training module (i.e., providing the accuracy metric) to determine whether the regex should be kept or removed.
[5] Content items have their text extracted and used to generate the regular expressions.
[6] Regular expressions are used to match patterns over input character classes (i.e., text data associated with the sampling of items).
[7] False positives.
[8] Regular expressions are used to identify whether a content item belongs in a particular cluster of non-compliant content items for a particular policy among a plurality of clusters for different policies (i.e., identify an item as belonging to a specific group among a plurality of different groups).
[9] Match rate (i.e., accuracy metric) for testing the regex against items that belong in groups outside of the specific group (i.e., false positives) is determined and output.
[10] The regular expression is matched against compliant content items (false positives) to determine a match rate (i.e., accuracy metric).
[11] The match rate is used by the training module (i.e., providing the accuracy metric) to determine whether the regex should be kept or removed.
[12] Rosh discloses any of the parameters of the CMDB can be associated with a device/application for grouping purposes.
[13] Arbitrary strings are interpreted as encompassing the name of the executable (i.e., command to start the process).
[14] Here, the specified threshold is 100% match success with desired examples with no false positives.  If it is below the threshold (i.e., includes false positives), then the regular expression is recalculated based on modifications.
[15] If the regular expression does not meet the specified threshold, the user is notified to modify the regular expression.