Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Compact Prosecution
The Examiner proposes amending the independent claims to include the following limitations:
a label index, wherein the label index indicates which fields in the dataset include PII, so that the application can mask just those fields with PII as needed, can access the data store storing the dataset fewer times, and transmits less data, reducing bandwidth usage; and
a user interface that provides feedback to a user by displaying reports about which data fields are labeled and with what probability, score and weight each dataset is classified.
This amendment will overcome the current rejection.
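
For illustration only, and not as part of the proposed claim language, the label index limitation could be sketched as follows. The names `mask_pii` and `pii_fields`, the mapping layout (field name to a PII flag), and the masking string are assumptions made for this sketch; nothing here is taken from the application or the cited references.

```python
from typing import Any, Dict, List

# Hypothetical label index: field name -> True if the field is labeled as PII.
LabelIndex = Dict[str, bool]


def pii_fields(label_index: LabelIndex) -> List[str]:
    """List the fields the index marks as PII, without touching the data store."""
    return sorted(name for name, is_pii in label_index.items() if is_pii)


def mask_pii(record: Dict[str, Any], label_index: LabelIndex, mask: str = "***") -> Dict[str, Any]:
    """Return a copy of the record in which only PII-labeled fields are masked.

    Fields missing from the index are treated as non-PII (an assumption).
    """
    return {
        name: (mask if label_index.get(name, False) else value)
        for name, value in record.items()
    }
```

Because the index is consulted instead of the stored values, only the fields it flags need to be masked, which is the reduction in data store accesses and transmitted data that the proposed limitation describes.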
  
Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 20 and 21 have been considered but are moot because the new ground of rejection does not rely on the combination of references (Walters US10459954 and Lorrain-Hale US20200301950) applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
profiling, by a data processing system, one or more data values of the field to generate a data profile;  (Col. 14, lines 19-22- The data profiling module generates a data profile by identifying the data schema from the received dataset)  accessing a plurality of label proposal tests; based on applying at least the plurality of label proposal tests to the data profile, (Col. 14 lines 13-19- thus “data profile module may identify a data schema.. where the data schema may include at least one of a label, a field or an index”--- thus based on a plurality of data schema at least one of them is applied) generating a set of label proposals; (Col. 13 lines 9-12 thus labels such as actual data, fully composed of synthetic data or partially composed of synthetic data are generated as the set of label proposals) 

[Image: media_image1.png]

receiving by a dataset … a request to identify a cluster (label tests) of connected datasets among the received plurality of datasets. Col. 3 lines 25-30- thus a clustering or classifying dataset reads on label proposal test (classifies or clusters are known in the art to be the same). 
“a label may indicate whether one or more data elements are actual data, synthetic data, relevant data or another category” 




determining a similarity among the label proposals in the set of label proposals; 
based at least on the similarity among the label proposals in the set, (Col. 12 lines 1-4: similarities between a dataset and a previously classified dataset are used to classify the dataset based on the labels shown in Fig. 1 above) selecting a classification; based on the classification, rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required; (Col. 9 lines 56-58: “Clustered datasets (Classified data)”; thus classifying the received dataset based on the Unclassified, Fully Actual, Partially Synthetic and Fully Synthetic Data shown in the figure reads on the label test)
identifying one of the label proposals as identifying the semantic meaning of the data of the field. (identifying one of the labels set in fig. 1 above) and grouping (Col. 16 lines 35-36- thus storing the segmented cluster of  datasets in a data storage)
(Note that in Col. 13 lines 9-12, it is the similarity in the schema of the datasets that classifies a dataset into one of the labels, such as actual data, fully composed of synthetic data, or partially composed of synthetic data; see Fig. 1 above.)
(Note also that the label tests classify each dataset as actual data or as fully or partially synthetic data.)

Walters does not disclose rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required; 
Lorrain-Hale discloses rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required. (Lorrain-Hale: Section 0056, lines 8-10 displaying the suggestions alongside the content pane (document such as invoice)) 


[Image: media_image2.png]



[Image: media_image3.png]






The four suggested tags read on the label proposal.
The user can approve or decline a suggested tag (label proposal) by selecting the option (“add as tag”).


Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of giving a user the option to add a suggested tag or label. The motivation is that the user has the flexibility to choose an appropriate label.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 5, 6-13, 15-17 and 20-30 are rejected under 35 U.S.C. 103 as being unpatentable over Walters et al. (US10459954) in view of Lorrain Hale (US20200301950), referred to hereinafter as Walters in view of Lorrain Hale.
Claim 1, Walters discloses a method implemented by a data processing system (Fig. 1, Col. 4, lines 52-55- System 100) for discovering a semantic meaning of data of a field included in one or more data sets, ( Col. 13, lines 48-50- based on the schema (label or field) the data element (semantic meaning) is discovered-indicated)  the method including: 
identifying a field included in one or more data sets, (Col. 14, lines 13-22, - thus the data schema (Label or field) is identified in the dataset) with the field associated with an identifier; (Col. 9 lines 49-50 “indicator of whether data element is actual data or synthetic)  and for that field profiling by a data processing system one or more data values of the field to generate a data profile; (Col. 14, lines 19-22- The data profiling module generates a data profile by identifying the data schema from the received dataset) 
accessing a plurality of label proposal tests based on applying at least the plurality of label proposal tests to the data profile, (Col. 14 lines 13-19- thus “data profile module may identify a data schema.. where the data schema may include at least one of a label, a field or an index”--- thus based on a plurality of data schema at least one of them is applied) generating a set of label proposals; (Col. 13 lines 9-12 thus labels such as actual data, fully composed of synthetic data or partially composed of synthetic data are generated as the set of label proposals)

[Image: media_image1.png]

receiving by a dataset … a request to identify a cluster (label tests) of connected datasets among the received plurality of datasets. Col. 3 lines 25-30- thus a clustering or classifying dataset reads on label proposal test (classifies or clusters are known in the art to be the same). 
“a label may indicate whether one or more data elements are actual data, synthetic data, relevant data or another category” 




Figure 1: the Unclassified, Fully Actual, Partially Synthetic and Fully Synthetic Data reads on the plurality of label proposal tests.  

determining a similarity among the label proposals in the set of label proposals; based at least on the similarity among the label proposals in the set, (Col. 12 lines 1-4: similarities between a dataset and a previously classified dataset are used to classify the dataset based on the labels shown in Fig. 1 above) selecting a classification; (Col. 13 lines 5-12: “dataset connector system classify dataset based on the data schema… or edges”; classifying a dataset means a classification is selected for that dataset)
based on the classification, identifying a label proposal that identifies the semantic meaning or determining that no input is required; (Col. 9 lines 56-58: “Clustered datasets (Classified data)”; thus classifying the received dataset based on the Unclassified, Fully Actual, Partially Synthetic and Fully Synthetic Data shown in the figure reads on the label test)
identifying one of the label proposals as identifying the semantic meaning; (identifying one of the labels set in fig. 1 above) and grouping (Col. 16 lines 35-36- thus storing the segmented cluster of  datasets in a data storage) the identifier of the field with the identified one of the label proposals that identifies the semantic meaning. (Col. 13, lines 9-12- thus based on the similarity in the schema of the datasets, the datasets may be classified as one of the labels (where the labels are either fully composed of actual data, fully composed of synthetic data or partially composed of synthetic data- see fig. 1 above)) 
(Note that the label tests classify each dataset as actual data or as fully or partially synthetic data.)
Walters does not disclose rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required.
Lorrain-Hale discloses rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required. (Lorrain-Hale: Section 0056, lines 8-10 displaying the suggestions alongside the content pane (document such as invoice))


[Image: media_image2.png]



[Image: media_image3.png]


The four suggested tags read on the label proposal.
The user can approve or decline a suggested tag (label proposal) by selecting the option (“add as tag”).



Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of giving a user the option to add a suggested tag or label. The motivation is that the user has the flexibility to choose an appropriate label.

Claim 3, Walters in view of Lorrain Hale discloses wherein profiling the data values of the field includes determining a statistical value representing the data values included in the field. (Walter: Col. 10 Lines 19-21- Thus the statistical profile of the dataset reads on the statistical value of the field) 
Claim 5, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes determining that the field includes a primary key for a data set of the one or more data sets; (Walter: Col. 3 lines 10-14- “foreign Key”) and 
selecting a label proposal test of the plurality of label proposal tests that is related to the primary key. (Walter: Col. 11 lines 7-10- thus labels associated with the candidate foreign key are selected for the dataset)
Claim 6, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes performing a metadata (Walter: Col. 4 lines 26-28 “metadata”) comparison of data values of the field to terms in a glossary of terms. (Walter: Col. 3 lines 35-36 data mapping reads on the metadata comparison, regarding glossary of terms, the directory disclosed in Col. 4 lines 26-28 reads on it.) 
Claim 7, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes determining from the data profile a pattern represented by the data values stored of the field (Walter: Col. 11 lines 25-27- thus datasets stored in clustered dataset reads on stored of the field)  determining a particular label that is mapped to the pattern and labeling the field with the particular label. (Walter: Col. 11 lines 20-27 “Data mapping maps the received dataset  based on similarities such as parent-child relationships (pattern represented by data stored in the field))
Claim 8, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes retrieving a list of values that are representative of a data collection; (Walter: Col. 11 lines 35-40- retrieving a data mapping model used for a dataset reads on the data collection) comparing the data values of the field to the list of values; determining, in response to the comparing, (Walter: Col. 11 lines 43-46 Mapping reads on comparing)  that a threshold number of the data values match the values of the list and in response to the determining, labeling the field with a particular label that specifies the data collection. (Walter: Col. 11 lines 43-46- thus “…the statistical similarity metric with one of the received datasets that meets a threshold criterion”- this means the mapping module maps/compares received dataset to a threshold value to cluster the dataset in that group or assign that label) 
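
For illustration only, the list-matching test recited in claim 8 (retrieve a list of representative values, compare the field's values against it, and label the field when a threshold number match) could be sketched as follows. The function name, its parameters, and the exact-match comparison are assumptions made for this sketch and are not drawn from the application or from Walters.

```python
from typing import Hashable, Iterable, Optional


def label_by_reference_list(
    field_values: Iterable[Hashable],
    reference_values: Iterable[Hashable],
    label: str,
    threshold: int,
) -> Optional[str]:
    """Return `label` if at least `threshold` field values appear in the
    reference list; otherwise return None (no label proposed)."""
    reference = set(reference_values)  # representative values of the data collection
    matches = sum(1 for value in field_values if value in reference)
    return label if matches >= threshold else None
```

Under these assumptions, a field holding `["MA", "NY", "ZZ", "CA"]` compared against a list of state codes has three matches, so it is labeled when the threshold is 3 and left unlabeled when the threshold is 4.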
Claim 9, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes generating at least two labels for the field; (Walter: Col. 8 lines 40-42- thus a label can have at least actual data, synthetic data field and relevant data field so at least two fields can be generated from the labels)  and 
determining whether the at least two labels are exclusive or inclusive of one another. (Walter: Fig. 4 element 430 shows that one dataset can have plurality of relationships meaning one dataset can be inclusive). 
Claim 10, Walters in view of Lorrain Hale discloses further including determining, in response to applying the plurality of label proposal tests, a relationship between the field and another field of the one or more data sets. (Walter: Col. 12 lines 25-31- thus the arrows and distances between discs shown in Fig. 4 represent data or field relationships between the datasets where the datasets are connected).
Claim 11, Walters in view of Lorrain Hale discloses wherein the relationship includes one of an indication that a first data value of the field determines a second data value stored in the other field, (Walter: the shading shown in Fig. 4 represents data that are stored in fields) an indication that the first data value correlates to the second data value, or an indication that the first data value is identical to the second data value. (Walter: Col. 12 lines 27-32- thus “Arrows and distance between discs represents aspects of data relationships between the dataset and shading represents classification of the datasets”; this means the arrows indicate that the connected datasets are identical. It also means that the data values are identical)
Claim 12, Walters in view of Lorrain Hale discloses wherein the plurality of label proposal tests are each associated with at least one weight value, (Walter: foreign key scores for the dataset- Col. 15 lines 1-3)  the method further including updating a weight value associated with at least one label proposal test; and 
reapplying the label proposal test to the data profile using the updated weight value. (Walter: Col. 2 lines 10-12- discloses using neural network models such as recurrent neural network and deep learning models which teaches constantly updating the parameters for classification which in this case is an example of the schema the foreign key scores- Col. 12 lines 20-25) 
Claim 13, Walters in view of Lorrain Hale discloses further including training the plurality of label proposal tests using a machine learning process. (Walter: Col. 10 lines 53-58- thus the mapping modules includes machine learning models) 
Claim 15, Walters in view of Lorrain Hale discloses wherein comparing the label proposals generated from the label proposal tests includes applying a score value to each label proposal; for each label of the label proposals, combining the score values associated with that label; (Walter: Col. 15 lines 21-25- thus “generating a plurality of edges between the selected dataset and the received dataset based on foreign key scores..”; this means each clustered dataset has an assigned key score/value) and ranking the labels according to the score value associated with each label. (Walter: a plurality of edges generated between the datasets based on scores means the datasets are ranked or classified based on the scores)
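
For illustration only, the scoring steps recited in claim 15 (apply a score to each label proposal, combine the scores per label, rank the labels) could be sketched as follows. Summation as the combining rule, the tie-break by label name, and the name `rank_labels` are assumptions made for this sketch; the claim language does not specify how scores are combined.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_labels(proposals: Iterable[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Combine per-proposal scores by label and rank labels by combined score.

    `proposals` holds one (label, score) pair per label proposal test result.
    Scores for the same label are summed (an assumed combining rule).
    """
    totals: Dict[str, float] = defaultdict(float)
    for label, score in proposals:
        totals[label] += score
    # Highest combined score first; ties broken alphabetically by label.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))
```

With the proposals `("phone", 0.5)`, `("ssn", 0.25)`, `("phone", 0.25)` and `("zip", 0.5)`, the label "phone" ranks first with a combined score of 0.75, followed by "zip" and then "ssn".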
Claim 16, Walters in view of Lorrain Hale discloses receiving validation of the label proposals from the plurality of label proposal tests (Walter: Col. 9 lines 56-58- thus “Clustered datasets (Classified data)” thus classifying the received dataset based the Unclassified, Fully Actual, Partially Synthetic and Fully Synthetic Data as shown on fig. reads on label test) 
 and responsive to receiving the validation weighting the plurality of label proposal tests with the label proposals. (Walter: Col. 14 lines 9-12 “label classifying the dataset as relevant to an analysis goal or topic”  this means the labeled dataset are scored) 
Claim 17, Walters in view of Lorrain Hale discloses wherein the data store includes a data dictionary. (Walter: Col. 14 lines 17-19- thus the directory of words or data reads on the dictionary) 
Claim 20, Walters discloses a data processing system (Fig. 1, Col. 4, lines 52-55- System 100) for discovering a semantic meaning of a field included in one or more data sets, ( Col. 13, lines 48-50- based on the schema (label or field) the data element (semantic meaning) is discovered-indicated)  the system including: 
a data storage storing instructions; and at least one processor configured to execute the instructions stored by the data storage (Col. 3 lines 44-47 – thus Non-Transitory computer readable storage media that stores program instructions) to perform operations including identifying a field included in one or more data sets, (Col. 14, lines 13-22- thus the data schema (Label or field) is identified in the dataset) with the field having an identifier, and for the field, (Col. 9 lines 49-50: indicator of whether a data element is actual data or synthetic) profiling, by a data processing system, one or more data values of the field to generate a data profile; (Col. 14, lines 19-22- The data profiling module generates a data profile by identifying the data schema from the received dataset)
accessing a plurality of label proposal tests based on applying at least the plurality of label proposal tests to the data profile, (Col. 14 lines 13-19- thus “data profile module may identify a data schema.. where the data schema may include at least one of a label, a field or an index”--- thus based on a plurality of data schema at least one of them is applied) generating a set of label proposals; (Col. 13 lines 9-12 thus labels such as actual data, fully composed of synthetic data or partially composed of synthetic data are generated as the set of label proposals). 

[Image: media_image1.png]

receiving by a dataset … a request to identify a cluster (label tests) of connected datasets among the received plurality of datasets. Col. 3 lines 25-30- thus a clustering or classifying dataset reads on label proposal test (classifies or clusters are known in the art to be the same). 
“a label may indicate whether one or more data elements are actual data, synthetic data, relevant data or another category”


Figure 2: Figure 3: the Unclassified, Fully Actual, Partially Synthetic and Fully Synthetic Data reads on the plurality of label proposal tests.  

determining a similarity among the label proposals in the set of label proposals; 
based at least on the similarity among the label proposals in the set, (Col. 12 lines 1-4: similarities between a dataset and a previously classified dataset are used to classify the dataset based on the labels shown in Fig. 1 above) selecting a classification; (Col. 13 lines 5-12: “dataset connector system classify dataset based on the data schema… or edges”; classifying a dataset means a classification is selected for that dataset)
based on the classification, rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required; (Col. 9 lines 56-58- thus “Clustered datasets (Classified data) may include graphical data” this means the Clustered datasets can be represented using a graphical data) 

identifying one of the label proposals as identifying the semantic meaning; (identifying one of the labels set in fig. 1 above) and 
the identifier of the field with the identified one of the label proposals that identifies the semantic meaning of the data field. (Col. 13, lines 9-12- thus based on the similarity in the schema of the datasets, (semantic meaning)  the datasets may be classified as one of the labels (where the labels are either fully composed of actual data, fully composed of synthetic data or partially composed of synthetic data- see fig. 1 above)) 
(Note that the label tests classify each dataset as actual data or as fully or partially synthetic data.)
Walters does not disclose rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required.
Lorrain-Hale discloses rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required. (Lorrain-Hale: Section 0056, lines 8-10 displaying the suggestions alongside the content pane (document such as invoice))


[Image: media_image2.png]



[Image: media_image3.png]



The four suggested tags read on the label proposal.
The user can approve or decline a suggested tag (label proposal) by selecting the option (“add as tag”).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of giving a user the option to add a suggested tag or label. The motivation is that the user has the flexibility to choose an appropriate label.

Claim 21, Walters discloses One or more non-transitory computer readable media storing instructions (Col. 3 lines 44-47 – thus Non-Transitory computer readable storage media that stores program instructions) for discovering a semantic meaning of a field included in one or more data sets, ( Col. 13, lines 48-50- based on the schema (semantic meaning) the data element (semantic meaning) is discovered-indicated) the instructions being executable by one or more processors (Col. 3 lines 44-47) configured to perform operations including:
identifying a field included in one or more data sets, (Col. 14, lines 13-22, - thus the data schema is identified in the dataset) with the field having an identifier; (Col. 9 lines 49-50 “indicator of whether data element is actual data or synthetic (Partially or fully)) and for that field profiling by a data processing system, one or more data values of the field to generate a data profile; (Col. 14, lines 19-22- The data profiling module generates a data profile by identifying the data schema from the received dataset) 
accessing a plurality of label proposal tests; based on applying at least the plurality of label proposal tests to the data profile, (Col. 14 lines 13-19- thus “data profile module may identify a data schema.. where the data schema may include at least one of a label, a field or an index”--- thus based on a plurality of data schema at least one of them is applied) generating a set of label proposals; (Col. 13 lines 9-12 thus labels such as actual data, fully composed of synthetic data or partially composed of synthetic data are generated as the set of label proposals)


[Image: media_image1.png]

receiving by a dataset … a request to identify a cluster (label tests) of connected datasets among the received plurality of datasets. Col. 3 lines 25-30- thus a clustering or classifying dataset reads on label proposal test (classifies or clusters are known in the art to be the same). 
“a label may indicate whether one or more data elements are actual data, synthetic data, relevant data or another category”


Figure 4: the Unclassified, Fully Actual, Partially Synthetic and Fully Synthetic Data reads on the plurality of label proposal tests.  
 

determining a similarity among the label proposals in the set of label proposals; based at least on the similarity among the label proposals in the set, (Col. 12 lines 1-4: similarities between a dataset and a previously classified dataset are used to classify the dataset based on the labels shown in Fig. 1 above) selecting a classification; (Col. 13 lines 5-12: “dataset connector system classify dataset based on the data schema… or edges”; classifying a dataset means a classification is selected for that dataset)
based on the classification, rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required; (Col. 9 lines 56-58- thus “Clustered datasets (Classified data) may include graphical data” this means the Clustered datasets can be represented using a graphical data) 
(regarding the graphical user interface the secondary reference also addresses this limitation- please see Fig. 3, section 0032, lines 10-14 in Procops)
identifying one of the label proposals as identifying the semantic meaning; (identifying one of the labels set in fig. 1 above) and the identifier of the field with the identified one of the label proposals that identifies the semantic meaning of the data field. (Col. 13, lines 9-12- thus based on the similarity in the schema of the datasets, the datasets may be classified as one of the labels (where the labels are either fully composed of actual data, fully composed of synthetic data or partially composed of synthetic data- see fig. 1 above)) 
(Note that the label tests classify each dataset as actual data or as fully or partially synthetic data.)
Walters does not disclose rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required.
Lorrain-Hale discloses rendering a graphical user interface that requests input in identifying a label proposal that identifies the semantic meaning or determining that no input is required. (Lorrain-Hale: Section 0056, lines 8-10 displaying the suggestions alongside the content pane (document such as invoice))


[Image: media_image2.png]



[Image: media_image3.png]


The four suggested tags read on the label proposal, which the user interface gives the user the opportunity to either confirm or decline.




The user can approve or decline a suggested tag (label proposal) by selecting the option (“add as tag”).


Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of giving a user the option to add a suggested tag or label. The motivation is that the user has the flexibility to choose an appropriate label.
Claim 22, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes determining that the field includes a primary key for a data set of the one or more data sets; (Walter: Col. 3 lines 10-14- “foreign Key”) and selecting a label proposal test of the plurality of label proposal tests that are related to the primary key. (Walter: Col. 11 lines 7-10- thus labels associated with the candidate foreign key is selected for the dataset)

Claim 23, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes: performing a metadata (Walter: Col. 4 lines 26-28 “metadata”) comparison of data values of the field to terms in a glossary of terms. (Walter: Col. 3 lines 35-36 data mapping reads on the metadata comparison, regarding glossary of terms, the directory disclosed in Col. 4 lines 26-28 reads on it.) 
Claim 24, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes determining, from the data profile, a pattern represented by the data values stored of the field; (Walter: Col. 11 lines 25-27- thus datasets stored in clustered dataset reads on stored of the field)  determining a particular label that is mapped to the pattern; and labeling the field with the particular label. (Walter: Col. 11 lines 20-27 “Data mapping maps the received dataset  based on similarities such as parent-child relationships (pattern represented by data stored in the field))
Claim 25, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes retrieving a list of values that are representative of a data collection; (Walter: Col. 11 lines 35-40- retrieving a data mapping model used for a dataset reads on the data collection) comparing the data values of the field to the list of values; determining, in response to the comparing, (Walter: Col. 11 lines 43-46 Mapping reads on comparing)  that a threshold number of the data values match the values of the list; and in response to the determining, labeling the field with a particular label that specifies the data collection. (Walter: Col. 11 lines 43-46- thus “…the statistical similarity metric with one of the received datasets that meets a threshold criterion”- this means the mapping module maps/compares received dataset to a threshold value to cluster the dataset in that group or assign that label) 


Claim 26, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes generating at least two labels for the field; (Walter: Col. 8 lines 40-42- thus a label can have at least actual data, synthetic data field and relevant data field so at least two fields can be generated from the labels)  and determining whether the at least two labels are exclusive or inclusive of one another. (Walter: Fig. 4 element 430 shows that one dataset can have plurality of relationships meaning one dataset can be inclusive). 

Claim 27, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes: determining that the field includes a primary key for a data set of the one or more data sets; (Walter: Col. 3 lines 10-14- “foreign Key”) and selecting a label proposal test of the plurality of label proposal tests that are related to the primary key. (Walter: Col. 11 lines 7-10- thus labels associated with the candidate foreign key is selected for the dataset)

Claim 28, Walters in view of Lorrain Hale discloses wherein applying the plurality of label proposal tests includes performing a metadata (Walter: Col. 4 lines 26-28 “metadata”) comparison of data values of the field to terms in a glossary of terms. (Walter: Col. 3 lines 35-36 data mapping reads on the metadata comparison, regarding glossary of terms, the directory disclosed in Col. 4 lines 26-28 reads on it.) 

Claim 29, Walters in view of Lorrain-Hale discloses wherein applying the plurality of label proposal tests includes: determining, from the data profile, a pattern represented by the data values stored of the field; (Walters: Col. 11 lines 25-27- thus datasets stored in a clustered dataset read on stored of the field) determining a particular label that is mapped to the pattern; and labeling the field with the particular label. (Walters: Col. 11 lines 20-27- thus data mapping maps the received dataset based on similarities such as parent-child relationships, which read on the pattern represented by data stored in the field)

Claim 30, Walters in view of Lorrain-Hale discloses wherein applying the plurality of label proposal tests includes retrieving a list of values that are representative of a data collection; (Walters: Col. 11 lines 35-40- retrieving a data mapping model used for a dataset reads on the data collection) comparing the data values of the field to the list of values; determining, in response to the comparing, (Walters: Col. 11 lines 43-46- mapping reads on comparing) that a threshold number of the data values match the values of the list; and in response to the determining, labeling the field with a particular label that specifies the data collection. (Walters: Col. 11 lines 43-46- thus “…the statistical similarity metric with one of the received datasets that meets a threshold criterion”- this means the mapping module compares the received dataset against a threshold value in order to cluster the dataset in that group or assign that label)


Claim(s) 2, 4, 14, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Walters et al. (US10459954) in view of Lorrain-Hale (US20200301950) as applied to claims 1, 3, 5, 6, 7-13, 15-17 and 20-30 above, and further in view of Procops et al. (US 20140108357).
Claim 2, Walters in view of Lorrain-Hale discloses wherein profiling the one or more data values of the field (Walters: Col. 9 lines 37-40- the data such as Social Security are profiled)
Walters in view of Lorrain-Hale does not disclose determining a format of a data value of the field.
Procops discloses determining a format of a data value of the field. (Procops: Section 0044- thus the format for the data value is that the data have to be an integer)
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of considering the format of a data value. The motivation is that the system is able to process the data.
Claim 4, Walters in view of Lorrain-Hale discloses wherein the statistical value (Walters: Col. 10 lines 19-21)
Walters in view of Lorrain-Hale does not disclose wherein the statistical value comprises at least one of a minimum length of the data values of the field, a maximum length of the data values of the field, a most common data value of the field, a least common data value of the field, a maximum data value of the field, and a minimum data value of the field.
Procops discloses a maximum length of the data values of the field, (Procops: Section 0047-0048- thus the maximum length of the data should be as specified by the user) and this meets the limitation because the claim requires at least one of the listed items.
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of considering the maximum length of the data values of the field. The motivation is that the system is able to process the data.

Claim 14, Walters in view of Lorrain-Hale discloses the method further comprising retrieving from a data quality rules environment one or more data quality rules that are assigned to the label proposal specifying the semantic meaning (Walters: Col. 13 lines 51-53- actual data or synthetic data reads on the semantic meaning of the data) and
Walters in view of Lorrain-Hale does not disclose assigning a data quality rule of the one or more data quality rules to the field.
Procops discloses assigning a data quality rule of the one or more data quality rules to the field. (Procops: Section 0078, lines 4-8- thus a list of fields has one or more validation rules which specify which labels can be stored)
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teaching of assigning data quality rules to the field. The motivation is that the system is able to process the data.

Claim 18, Walters in view of Lorrain-Hale and further in view of Procops discloses outputting the label proposals (Walters: Fig. 4) to a data quality rules environment. (Procops: Section 0056, lines 3-5- thus user-specified validation rules of the dataset)
Claim 19, Walters in view of Lorrain-Hale and further in view of Procops discloses reducing based on the identified one of the label proposals a number of errors for processing data for the field using data quality rules (Procops: Section 0064- thus the validation rule checks the correctness of the dataset being entered into the field and thereby reduces entry of the wrong dataset into the wrong data field) from the data quality environment relative to another number of errors for processing the data for the field without using the identified one of the label proposals. (Walters: Col. 6 lines 12-15- thus the classification errors are measured)
Cited Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Seigel et al. (US20170161503) discloses that, in a large computer system, maintaining information security is a difficult task because, in many cases, a security system may have difficulty distinguishing legitimate activities from the unauthorized access of data. Currently, a risk associated with a user account may be determined by looking at the resources to which the user account has access, groups to which the user account belongs, and resources which the user account owns.
Redlich et al. (US20150199405) discloses a method of organizing and processing data in a distributed computing system. The computing system has a plurality of select content data stores for respective ones of a plurality of enterprise designated categorical filters which include content-based filters, contextual filters and taxonomic classification filters, all operatively coupled over a communications network.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Akwasi M Sarpong whose telephone number is (571)270-3438. The examiner can normally be reached Mon-Fri. 8:00am-4:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KING D POON can be reached on 571-272-7440. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AKWASI M SARPONG/           Primary  Examiner, Art Unit 2675 
11/01/2022.