DETAILED ACTION
This communication is responsive to the Applicant’s amendment filed on 08/15/2022. Claims 1, 5-6, 9 and 11 have been amended. Claims 1-20 are pending and are directed towards SYSTEMS AND METHODS FOR SECURING DATA BASED ON DISCOVERED RELATIONSHIPS.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.



Claims 9 and 19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claims 9 and 19 recite the limitation "the first set of one or more regular expressions and the second set of one or more regular expressions". There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-9 and 11-19 are rejected under 35 U.S.C. 103 as being unpatentable over Oberbreckling et al. US 2018/0075104 A1 (hereinafter “Oberbreckling”) in view of Murray et al. US 2019/0102438 A1 (hereinafter “Murray”).

As per claim 1, Oberbreckling teaches a method comprising: 
receiving, by a query analytic system, at least one query that accesses data from a set of data objects, including a first data object and a second data object (The received data may include structured data, unstructured data, or a combination thereof..., the data sources can include a public cloud storage service 311, a private cloud storage service 313, various other cloud services 315, a URL or web-based data source 317, or any other accessible data source. A data enrichment request from the client 304 can identify a data source and/or particular data (tables, columns, files, or any other structured or unstructured data available through data sources 309 or client data store 307). Data enrichment service 302 may then access the identified data source to obtain the particular data specified in the data enrichment request. Oberbreckling, para [0063]) (a dataset (e.g., a first dataset) may be accessed from a data source (e.g., a first data source). At block 654, a different dataset (e.g., a second dataset) may be accessed from a data source (e.g., a second data source) that is different from the data source accessed at block 652. A dataset may be accessed from a data source using one or more interfaces. Oberbreckling, para [0160]);
wherein the at least one query was generated by an application to accomplish a function at an application layer (Each client device can include one or more client applications 304 through which the data enrichment service 302 can be accessed. For example, a browser application, a thin client (e.g., a mobile app), and/or a thick client can execute on the client device and enable the user to interact with the data enrichment service. Oberbreckling, para [0059]) (The relationships between any two compared datasets may be used to determine one or more recommendations for merging (e.g., joining), or “blending,” the data sets together. Oberbreckling, para [0114]); 
detecting, by the query analytic system, based on at least one access pattern associated with how the application layer accesses data from a data layer, a relationship between the first data object and the second data object that has not been previously defined at the data layer (pairs of columns [data layer], or column pairs are identified between the first dataset and the second dataset as having a potential relationship. The column pairs may be identified by data discovery. Oberbreckling, para [0161]) (data discovery engine 344 may implement one or more techniques for cross-source relationship discovery. Relationship discovery may include determining a relationship between a subset of data, such as a relationship between a pair of columns, or column pair, each column in a different dataset of the datasets that are compared. Given two datasets to process for relationship discovery, data discovery engine 344 can identify and recommends a ranked subset of column pairs between two compared datasets. The ranked column pairs identified as a relationship may be useful for blending the datasets with respect to those column pairs. Oberbreckling, para [0115]-[0119])( profile engine 326 can analyze data from a data source to determine whether any patterns exist, and if so, whether a pattern can be classified. Once data obtained from a data source is normalized, the data may be parsed to identify one or more attributes or fields in the structure of the data. Patterns may be identified using a collection of regular expressions. Oberbreckling, para [0077]-[0079])( Each score may be generated based on one or more scoring functions. Scoring functions are the independent variables in the predictive model which predict the dependent variable. The dependent variable is the prediction of how related, or “joinable” the columns might be to the user. Each scoring function may be based on a feature or classification for a comparison type of columns. 
A score may be generated for each scoring function as a feature of the column pair. Oberbreckling, para [0182]);
responsive to detecting the relationship, storing, by the query analytic system, an indication that identifies, at the data layer, the relationship between the first data object and the second data object that was not previously defined at the data layer (The columns of the selected column pair are identified as a pair, or keys, of a relationship between the accessed datasets. Oberbreckling, para [0151])(The transform script may include information identifying the columns of the pair as keys for a relationship. Oberbreckling, para [0152]); and 
Oberbreckling teaches performing an action on both datasets (a selection is received for a type of join function for one of the column pairs as a basis for merging or blending the datasets based on the relationship discovered for the selected column pair. Oberbreckling, para [0167]). However, Oberbreckling does not explicitly teach performing, based at least in part on the stored indication, an operation requested against the first data object also against the second data object.
However, Murray teaches performing, based at least in part on the stored indication, an operation requested against the first data object also against the second data object (if the system determines that a column of data in the second data set matches a column of data in the first data set, then the system can recommend actions for the column of data in the second data set based on the history information (e.g., recommended actions, recommended actions accepted by the user, and/or manual actions performed by the user) for the column in the first data set. The column of data in the second data set can be determined to match the column of data in the first data set based on the column level signatures or fingerprints for the column of data. The system will determine a data set from the metadata store that has a similar column level signature. Murray, para [0084]).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teaching of Oberbreckling in view of Murray. One would have been motivated to do so in order to apply similar actions to similar datasets.

As per claim 2, Oberbreckling and Murray teach the method of Claim 1, wherein the at least one access pattern comprises a co-occurrence of references to the first data object and the second data object within the at least one query (The metadata may be compared to identify similarities and/or to determine a type of the information. The information identified based on the data may be compared to known types of data (e.g., business information, personal identification information, or address information) to identify the data that corresponds to a pattern. Oberbreckling, para [0080]) (One of the identified patterns may be selected based on the difference. For example, one pattern may be disambiguated from another pattern based on a frequency of the patterns in the data. Oberbreckling, para [0082]) (Knowledge service 310 can implement a method to determine the semantic similarity between two or more datasets. This may also be used to match the user's data to reference data available through the knowledge service. Oberbreckling, para [0098]) (Knowledge service 310 can implement a method to determine the semantic similarity between the augmented data set and each category in knowledge source 340 to identify a name for the category. The name of the category may be chosen based on a highest similarity metric. The similarity metric may be computed based on the number of terms in the data set that match a category name. The category may be chosen based on the highest number of terms matching based on the similarity metric. Oberbreckling, para [0101]).
 
As per claim 3, Oberbreckling and Murray teach the method of Claim 1, wherein the at least one access pattern comprises a join operation that joins at least a first portion of data in the first data object with at least a second portion of data in the second data object (The relationships between any two compared datasets may be used to determine one or more recommendations for merging (e.g., joining), or “blending,” the data sets together. Oberbreckling, para [0114]).  

As per claim 4, Oberbreckling and Murray teach the method of Claim 1, wherein different respective access patterns are assigned different respective weights based on how predictive the different respective access patterns are that the first data object and the second data object are related (Given two datasets to process for relationship discovery, data discovery engine 344 can identify and recommends a ranked subset of column pairs between two compared datasets. The ranked column pairs identified as a relationship may be useful for blending the datasets with respect to those column pairs. Oberbreckling, para [0115]) (Knowledge service 310 may provide, for each of the domains identified by knowledge service 310, a similarity metric indicating a degree of similarity to the domain. The techniques disclosed herein for similarity metric analysis and scoring can be applied by recommendation engine 308 to determine a classification of data processed by profile engine 326. The metadata generated by profile engine 326 may include information about the knowledge domain, if any are applicable, and a metric indicating a degree of similarity with the data analyzed by profile engine. Oberbreckling, para [0083])(Weights may be adjusted for a predictive model based on the success of the user choosing one of the ranked columns. The weights may be adjusted based on the selection of a column pair relative to the rank. For example, if the rank is useful such that a user selected a column pair having a high rank, then the weights may be maintained as useful because a highest ranking column pair is chosen. If a column pair is selected based on a lower rank, then weights may be adjusted to favor one or more scoring functions to improve ranking of the column pairs closer to the selection. Oberbreckling, para [0150]).  

As per claim 5, Oberbreckling and Murray teach the method of Claim 4, further comprising: training a machine-learning model to learn the different respective weights for the different respective access patterns based, at least in part, on feedback provided by a user that labels two or more data objects as related or unrelated (The process(es) may include employing a predictive model to score and/or rank the remaining column pairs as an indication of possible relationships between two datasets. Oberbreckling, para [0118]) (keys for relationships between data sets are discovered and users verify relationships based on confidence metrics and previews. Oberbreckling, para [0117]) (data discovery may implement learning techniques (e.g., machine learning or user-assisted learning) to identify a pattern, such as certain types of relationships between columns. For example, similar relationships can be identified based on the previous relationships using the predictive model. Oberbreckling, para [0156]) (a score is generated for each column pair according to the selected predictive model. The predictive model is applied for a specific join type to predict pair relationship rank. As a part of applying the model, one or more scores is determined for each comparison. Each score may be generated based on one or more scoring functions. Each scoring function may be based on a feature or classification for a comparison type of columns. A score may be generated for each scoring function as a feature of the column pair. Examples of features include, without restriction or limitation by order, “byColumnType,” “byValueUniquenessLeft,” “byValueUniquenessRight,” “byExampleCharacterSequences,” “byExampleValues,” “byUniqueExampleOverlap,” “byHeaderNameMatch,” “byHistograms,” or a combination thereof. Each score may be weighted based on a weight defined by a model for the comparison.
A column pair's final score is based on a combination (e.g., a summation) of the weighted score for each scoring function. Oberbreckling, para [0146]-[0147]).
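
For illustration only, and not as part of the cited disclosure, the weighted summation quoted from Oberbreckling, para [0146]-[0147] may be sketched as follows. The feature names are those quoted above; the function names, weight values, and score values are hypothetical and are not taken from either reference.

```python
from typing import Dict, List


def final_score(feature_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-feature scores into one score by weighted summation.

    A feature with no entry in `weights` contributes nothing (assumed weight 0.0).
    """
    return sum(weights.get(name, 0.0) * score for name, score in feature_scores.items())


def rank_column_pairs(
    pair_scores: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[str]:
    """Return column-pair labels ordered from highest to lowest final score."""
    return sorted(
        pair_scores,
        key=lambda pair: final_score(pair_scores[pair], weights),
        reverse=True,
    )


# Hypothetical weights and scores, chosen only to make the arithmetic easy to follow.
WEIGHTS = {"byColumnType": 2.0, "byHeaderNameMatch": 4.0}
PAIR_SCORES = {
    "orders.cust_id/customers.id": {"byColumnType": 1.0, "byHeaderNameMatch": 0.5},
    "orders.total/customers.id": {"byColumnType": 1.0, "byHeaderNameMatch": 0.0},
}
RANKED = rank_column_pairs(PAIR_SCORES, WEIGHTS)
```

Under these assumed values the first pair scores 2.0 + 2.0 = 4.0 and the second scores 2.0, so the first pair ranks higher.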

As per claim 6, Oberbreckling and Murray teach the method of Claim 1, further comprising: 
determining an amount of overlap between data stored in the first data object and data stored in the second data object (The profile metadata for column pairs that are compared enables data discovery engine 344 to determine a relationship, such as any overlap between columns. Oberbreckling, para [0118]) (knowledge service 310 can determine the similarity of two data sets A and B, by determining the ratio of the size of the intersection of the data sets to the size of the union of the data sets. For example, a similarity metric may be computed based on the ratio of 1) the size of the intersection of a data set (e.g., an augmented data set) and a category and 2) the size of their union. Oberbreckling, para [0104]); wherein the indication is further stored responsive to detecting that the amount of overlap satisfies a threshold (A threshold can be set in recommendation engine 318 such that matches having a confidence score greater than the threshold are applied automatically. Oberbreckling, para [0131]) (Column pairs may be filtered based on a threshold by which a column pair is identified as being unlikely to have a relationship between columns in the column pair. Column pairs may be filtered using the profile data. A column pair is excluded based on determining that no join can be performed between the columns in the column pair. Oberbreckling, para [0144], [0174]).
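
For illustration only, the ratio quoted from Oberbreckling, para [0104] (size of the intersection over size of the union) corresponds to what is commonly called the Jaccard index, and may be sketched as follows. The function names, the default threshold value, and the choice of a greater-than-or-equal comparison are assumptions made for this sketch; neither reference specifies them.

```python
from typing import Hashable, Iterable


def jaccard_similarity(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Return |A intersect B| / |A union B| for the distinct values of two columns."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Two empty columns: treat the overlap as zero rather than dividing by zero.
        return 0.0
    return len(set_a & set_b) / len(union)


def overlap_satisfies_threshold(
    a: Iterable[Hashable], b: Iterable[Hashable], threshold: float = 0.5
) -> bool:
    """Assumed rule: the overlap satisfies the threshold when it is at least `threshold`."""
    return jaccard_similarity(a, b) >= threshold
```

For example, columns holding the values 1, 2, 3 and 2, 3, 4 share two of four distinct values, giving a ratio of 0.5.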

As per claim 7, Oberbreckling and Murray teach the method of Claim 1, wherein the indication is further stored based at least in part on detecting that the first data object and the second data object have matching metadata or metadata patterns (The metadata itself may indicate information about the data. The metadata may be compared to identify similarities and/or to determine a type of the information. The information identified based on the data may be compared to known types of data (e.g., business information, personal identification information, or address information) to identify the data that corresponds to a pattern. Oberbreckling, para [0080]) (The profile metadata for column pairs that are compared enables data discovery engine 344 to determine a relationship, such as any overlap between columns. Oberbreckling, para [0118]).

As per claim 8, Oberbreckling and Murray teach the method of Claim 1, further comprising: 
matching a first set of one or more regular expressions to data stored in the first data object and a second set of one or more regular expressions to metadata associated with the first data object (profile engine 326 may identify patterns in data based on a set of regular expressions defined by semantic constraints or syntax constraints. A regular expression may be used to determine the shape and/or structure of data. Profile engine 326 may implement operations or routines (e.g., invoke an API for routines that perform processing for regular expressions) to determine patterns in data based on one or more regular expressions. For example, a regular expression for a pattern may be applied to data based on syntax constraints to determine whether the pattern is identifiable in the data. Oberbreckling, para [0077]-[0079]);
responsive to matching the first set of one or more regular expressions and the second set of one or more regular expressions, classifying at least one of the first data object or the second data object as storing sensitive data (The data may be compared to entity information for different entities to determine if there is a match with one or more entities based on the identified pattern. For example, the knowledge service 310 can match the pattern “XXX-XX-XXXX” to the format of U.S. social security numbers. Furthermore, the knowledge service 310 can determine that social security numbers are protected or sensitive information, the disclosure of which is linked to various penalties. In some embodiments, profile engine 326 can perform statistical analysis to disambiguate between multiple classifications identified for data processed by profile engine 326. For example, when text is classified with multiple domains, profile engine 326 can process the data to statistically determine the appropriate classification determined by knowledge service 310. The statistical analysis of the classification can be included in the metadata generated by profile engine 326. Oberbreckling, para [0086]-[0089]), 
wherein the operation comprises a security operation to protect the sensitive data (the recommendation engine 308 can generate transform recommendations based on the matched patterns received from the knowledge service 310. For example, for the data including social security numbers, the recommendation engine can recommend a transform that obfuscates the entries (e.g., truncating, randomizing, or deleting, all or a portion of the entries). Other examples of transformation may include, reformatting data (e.g., reformatting a date in data), renaming data, enriching data (e.g., inserting values or associating categories with data), searching and replacing data (e.g., correcting spelling of data), change case of letter (e.g., changing a case from upper to lower case), and filter based on black list or white list terms. Oberbreckling, para [0092]).  
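
For illustration only, the pattern matching and obfuscation quoted from Oberbreckling, para [0086]-[0092] may be sketched as follows. The regular expression is one possible rendering of the quoted “XXX-XX-XXXX” format, and the masking rule (keeping only the last four digits) is one hypothetical form of the truncation mentioned in the reference; all function names are assumptions made for this sketch.

```python
import re
from typing import List

# Assumed rendering of the "XXX-XX-XXXX" shape: three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def looks_like_ssn(value: str) -> bool:
    """True when the whole value has the shape of a U.S. social security number."""
    return SSN_PATTERN.fullmatch(value) is not None


def obfuscate(value: str) -> str:
    """Mask every digit except the last four (a hypothetical truncation rule)."""
    return "XXX-XX-" + value[-4:]


def protect_column(values: List[str]) -> List[str]:
    """Obfuscate the entries that match the pattern and leave the others unchanged."""
    return [obfuscate(v) if looks_like_ssn(v) else v for v in values]
```

Applied to a column containing "123-45-6789" and "hello", the sketch masks the first entry as "XXX-XX-6789" and leaves the second unchanged.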

As per claim 9, Oberbreckling and Murray teach the method of Claim 1, further comprising: generating a confidence score that the first data object stores sensitive information based, at least in part, on the first set of one or more regular expressions and the second set of one or more regular expressions (knowledge service 310 can provide a confidence score for a given pattern match. A threshold can be set in recommendation engine 318 such that matches having a confidence score greater than the threshold are applied automatically. Oberbreckling, para [0131]) (Profile engine 326 may perform parsing operations using one or more regular expressions to identify patterns in data processed by profile engine 326. Regular expressions may be ordered according to a hierarchy. Patterns may be identified based on order of complexity of the regular expressions. Multiple patterns may match data that is being analyzed; the patterns having the greater complexity will be selected. Profile engine 326 may perform statistical analysis to disambiguate between patterns based on the application of regular expressions that are applied to determine those patterns. Oberbreckling, para [0079]).

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Oberbreckling et al. US 2018/0075104 A1 (hereinafter “Oberbreckling”) in view of Murray et al. US 2019/0102438 A1 (hereinafter “Murray”) and further in view of Stevens et al. US 2006/0095466 A1 (hereinafter “Stevens”).

As per claim 10, Oberbreckling and Murray teach the method of Claim 1. Oberbreckling does not explicitly teach wherein the indication identifies at least one of a primary-foreign key relationship or a composite key relationship between the first data object and the second data object. 
However, Stevens teaches wherein the indication identifies at least one of a primary-foreign key relationship or a composite key relationship between the first data object and the second data object (One way to define a "relationship" between two data objects is through a "primary-key/foreign-key" relationship. A "primary-key" for a category is an annotation rule (or rules) whose value (or values taken together) uniquely identifies each data object in the category. A first data object has a primary-key/foreign-key relationship to a second data object (in a different category or in the same category) when the second data object has an annotation rule value that references a primary-key value of the first data object. This annotation rule for the second data object is called a "foreign-key." Stevens, para [0034]).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system of Oberbreckling in view of Stevens. One would have been motivated to do so in order to define a relationship between two datasets (Stevens, para [0034]).

Claims 11-20 are rejected using similar prior art and rationale as used to reject claims 1-10, respectively.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KHALID M ALMAGHAYREH whose telephone number is (571)272-0179. The examiner can normally be reached Monday - Thursday 8AM-5PM EST & Friday variable.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SALEH NAJJAR, can be reached at (571)272-4006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





Respectfully Submitted

/KHALID M ALMAGHAYREH/Examiner, Art Unit 2492