DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Williamson (US 20180232528 A1, hereinafter Williamson).  

Regarding Claim 1, Williamson discloses an apparatus comprising at least one processor and at least one non-transitory memory comprising program code, wherein the at least one non-transitory memory and the program code are configured to, with the at least one processor, cause the apparatus to at least ([0111]: FIG. 13; The computer system 1300 can be used to execute instructions 1324 (e.g., program code or software) for causing the machine to perform any one or more of the methodologies (or processes) described herein): 
retrieve a first plurality of data objects associated with a first database schema from a database (Fig. 1; [0026]: The data pre-processor 106 receives data from the input data sources 102A-N; [0025]: Examples of input data sources 102A-N include relational databases; [0050]: the metadata includes… schema names, database names. See also Fig. 2, Metadata (Schema) Analyzer 202, para [0059]); 
determine, based at least on the first plurality of data objects, a first data classifier corresponding to the first database schema (Fig. 1; [0026]: The data pre-processor 106 receives data from the input data sources 102A-N and pre-process the data for use by the data classifier 108); 
generate a mapping specification based at least in part on the first data classifier and the first plurality of data objects (Fig. 1; [0027]: The data pre-processor 106 may determine the relationship between the input data sources 102A-N using various rules), 
wherein the mapping specification is configured to convert the first plurality of data objects associated with the first database schema to a second plurality of data objects associated with a second database schema (Fig. 1; [0026]: In another embodiment, the data pre-processor 106 converts the data in the input data sources 102A-N to a common data structure that may be parsed by the data classifier 108… This common data structure, may be, for example, a graph structure that indicates the relationship between various data elements from the input data (via connecting edges in the graph), and presents the input data in a single format, such as text [the common data structure corresponds to the second database schema or structure that defines how the data is organized]. See also para [0027]-[0028]); and 
generate the second plurality of data objects based at least in part on the first plurality of data objects and the mapping specification (Fig. 1; [0027]-[0028]: The data pre-processor 106 may place each converted data element of the input data sources 102A-N in its own unit in the common data structure… The data pre-processor 106 may further combine data from different input data sources 102A-N together into the same common data structure… Furthermore, the data pre-processor 106 may combine data associated with a single metadata label into a single unit, e.g., all data under a single structure, in the common data structure [data in the common data structure corresponds to the second plurality of data objects]).

Regarding Claim 2, Williamson discloses the apparatus of claim 1, wherein the first plurality of data objects comprise a first data table, wherein the first data table comprises at least one data field ([0029]: Each subsection may include groups of data portions that are categorized together, such as within a single file, data instance, container, or database table; [0031]: Instead, the data classifier 108 may assume that a unit of data that is separated from other data using a delimiter, such as a cell boundary in a table… is a data portion. Thus, a data portion may be a cell in a database, table, or other data structure, a table in a database. See also para [0100]).

Regarding Claim 3, Williamson discloses the apparatus of claim 2, wherein the first data table comprises at least one of name metadata, column metadata, or row metadata ([0050]: The metadata analyzer 202 analyzes the metadata of a data portion in the data received from the data of the input data sources 102A-N… the metadata includes the metadata labels directly extracted from the input data sources 102A-N. This includes column labels, schema names, database names, tables names, XML tags, filenames, file headers, other tags, file metadata, and so on).

Regarding Claim 4, Williamson discloses the apparatus of claim 3, wherein, when determining the first data classifier corresponding to the first database schema, the at least one non-transitory memory and the program code are configured to, with the at least one processor, cause the apparatus to: 
retrieve at least one of the name metadata, the column metadata, or the row metadata associated with the first plurality of data objects (Fig. 2; [0050]: The metadata analyzer 202 analyzes the metadata of a data portion in the data received from the data of the input data sources 102A-N… the metadata includes the metadata labels directly extracted from the input data sources 102A-N. This includes column labels, schema names, database names, tables names, XML tags, filenames, file headers, other tags, file metadata, and so on); and 
determine the first data classifier based further on at least one of the name metadata, the column metadata, or the row metadata (Fig. 2; [0050]: The metadata analyzer 202 may determine that a data portion is sensitive data if it indicates one or more sensitive data types, including but not limited to 1) names. See also para [0040]-[0044]).

Regarding Claim 5, Williamson discloses the apparatus of claim 2, wherein the first plurality of data objects comprise a second data table, wherein the at least one non-transitory memory and the program code are configured to, with the at least one processor, cause the apparatus to: 
determine correlation metadata associated with the first data table and the second data table ([0028]: In one embodiment, the data pre-processor 106 combines data having the same metadata labels. This may be achieved using a matching algorithm that matches metadata labels having the exact same label or labels that are within a threshold degree of similarity (e.g., labels that match if non-alphanumeric characters are removed)… For example, a column in a database may have a metadata label indicating the data in the column are credit card numbers); and 
determine the first data classifier based further on the correlation metadata ([0041]: Logical rules may be modified by the classifier refinement engine 110 based on configuration information provided by a user… Reference tables may be updated using newly received reference data. Contextual matching may also be updated based on new indications of contextual data).
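For illustration only, the label matching described in Williamson at [0028] (exact matches, or matches once non-alphanumeric characters are removed) may be sketched as follows. The function names, the sample table and column names, and the case-insensitive comparison are assumptions made for this sketch and do not appear in Williamson.

```python
import re


def normalize_label(label: str) -> str:
    """Drop every non-alphanumeric character from a metadata label.

    Lower-casing is an added assumption; Williamson [0028] only mentions
    removing non-alphanumeric characters.
    """
    return re.sub(r"[^0-9a-z]", "", label.lower())


def labels_match(a: str, b: str) -> bool:
    """True when two labels are identical or equal after normalization."""
    return a == b or normalize_label(a) == normalize_label(b)


def group_columns_by_label(tables: dict) -> dict:
    """Group (table, column) pairs whose labels normalize to the same key.

    `tables` maps a hypothetical table name to its list of column labels.
    """
    groups = {}
    for table, columns in tables.items():
        for column in columns:
            groups.setdefault(normalize_label(column), []).append((table, column))
    return groups


# Hypothetical input: two tables that share a credit card column.
example = {
    "customers": ["credit-card", "email"],
    "orders": ["credit_card", "total"],
}
correlated = group_columns_by_label(example)
```

In this sketch, the entry of `correlated` keyed by `creditcard` lists one column from each table, which is one possible reading of metadata that associates a first data table with a second data table.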

Regarding Claim 6, Williamson discloses the apparatus of claim 2, wherein, when determining the first data classifier corresponding to the first database schema, the at least one non-transitory memory and the program code are configured to, with the at least one processor, cause the apparatus to: 
determine domain metadata associated with the first data table ([0054]: The pattern matcher 206 matches data portions received in the data from the data pre-processor 106 with various patterns… Emails may be matched by a string of characters (the local-part), followed by the “@” symbol, followed by an alphanumeric string of characters (the domain) that may include a period, and then ending with a period and sequence of characters that matches a top level domain (e.g., “.com,” “.mail”)); and 
determine the first data classifier based further on the domain metadata ([0055]: The rules may fall into various categories… Another rule may indicate that the data portion must pass some sort of verification test, such as Luhn check for credit card numbers, or having an email domain be checked to see if it is a valid domain).
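For illustration only, the two checks cited from Williamson at [0054]-[0055] (matching an email against a local-part, "@", domain pattern, and applying a Luhn check to a candidate credit card number) may be sketched as follows. The regular expression and the function names are assumptions made for this sketch; Williamson does not give a specific expression or implementation.

```python
import re
from typing import Optional

# Assumed pattern: local-part, "@", then a domain ending in a period and a
# top-level domain of two or more letters.
EMAIL_PATTERN = re.compile(r"[^@\s]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def email_domain(value: str) -> Optional[str]:
    """Return the domain of an email-shaped string, or None if it does not match."""
    match = EMAIL_PATTERN.fullmatch(value)
    return match.group(1) if match else None


def luhn_valid(number: str) -> bool:
    """Apply the Luhn checksum to the digits in `number` (separators are ignored)."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A data portion that yields a domain from `email_domain`, or that passes `luhn_valid`, would be a candidate for the corresponding data type under the rules quoted above.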

Regarding Claim 7, Williamson discloses the apparatus of claim 1, wherein, prior to generating the mapping specification, the at least one non-transitory memory and the program code are configured to, with the at least one processor, cause the apparatus to further:
calculate a confidence score associated with the first data classifier ([0069]: The confidence value calculator 218 determines a confidence value for each data portion that is scanned… This means that the data portion is first checked to see if it is of a right type and format for the data classifier 108 component to process. In such a case, a data portion that is processed by a data classifier 108 component is given an initial confidence value); and 
determine whether the confidence score satisfies a predetermined threshold ([0038]: In one embodiment, the data classifier 108 further determines the security level of data portions determined to be sensitive (or likely to be sensitive and having a confidence value beyond a certain threshold). See also para [0070], [0072], [0102]). 
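For illustration only, a confidence value that is compared against a threshold, as cited from Williamson at [0038] and [0069], may be sketched as follows. The scoring rule (the fraction of values in a column that satisfy a predicate), the 0.7 threshold, and all names are assumptions made for this sketch; Williamson does not prescribe this computation.

```python
from typing import Callable, Iterable


def confidence_score(values: Iterable[str], predicate: Callable[[str], bool]) -> float:
    """Hypothetical score: the fraction of values that satisfy `predicate`."""
    items = list(values)
    if not items:
        return 0.0
    return sum(1 for value in items if predicate(value)) / len(items)


def satisfies_threshold(score: float, threshold: float = 0.7) -> bool:
    """True when the score meets or exceeds the assumed threshold."""
    return score >= threshold


# Hypothetical column in which three of four values look like emails.
column = ["a@b.com", "x", "c@d.org", "e@f.net"]
score = confidence_score(column, lambda value: "@" in value)
```

Under these assumptions the example column scores 0.75, which satisfies a threshold of 0.7 but not one of 0.8.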

Regarding Claim 8, Williamson discloses a computer-implemented method, comprising:
retrieving a first plurality of data objects associated with a first database schema from a database (Fig. 1; [0026]: The data pre-processor 106 receives data from the input data sources 102A-N; [0025]: Examples of input data sources 102A-N include relational databases; [0050]: the metadata includes… schema names, database names. See also Fig. 2, Metadata (Schema) Analyzer 202, para [0059]); 
determining, based at least on the first plurality of data objects, a first data classifier corresponding to the first database schema (Fig. 1; [0026]: The data pre-processor 106 receives data from the input data sources 102A-N and pre-process the data for use by the data classifier 108); 
generating a mapping specification based at least in part on the first data classifier and the first plurality of data objects ([0027]: The data pre-processor 106 may determine the relationship between the input data sources 102A-N using various rules), 
wherein the mapping specification is configured to convert the first plurality of data objects associated with the first database schema to a second plurality of data objects associated with a second database schema ([0026]: In another embodiment, the data pre-processor 106 converts the data in the input data sources 102A-N to a common data structure that may be parsed by the data classifier 108… This common data structure, may be, for example, a graph structure that indicates the relationship between various data elements from the input data (via connecting edges in the graph), and presents the input data in a single format, such as text [the common data structure corresponds to the second database schema or structure that defines how the data is organized]. See also para [0027]-[0028]); and 
generating the second plurality of data objects based at least in part on the first plurality of data objects and the mapping specification ([0027]-[0028]: The data pre-processor 106 may place each converted data element of the input data sources 102A-N in its own unit in the common data structure… The data pre-processor 106 may further combine data from different input data sources 102A-N together into the same common data structure… Furthermore, the data pre-processor 106 may combine data associated with a single metadata label into a single unit, e.g., all data under a single structure, in the common data structure [data in the common data structure corresponds to the second plurality of data objects]).

Regarding Claim 9, Williamson discloses the computer-implemented method of claim 8, wherein the first plurality of data objects comprise a first data table, wherein the first data table comprises at least one data field ([0029]: Each subsection may include groups of data portions that are categorized together, such as within a single file, data instance, container, or database table; [0031]: Instead, the data classifier 108 may assume that a unit of data that is separated from other data using a delimiter, such as a cell boundary in a table… is a data portion. Thus, a data portion may be a cell in a database, table, or other data structure, a table in a database. See para [0100]).

Regarding Claim 10, Williamson discloses the computer-implemented method of claim 9, wherein the first data table comprises at least one of name metadata, column metadata, or row metadata ([0050]: The metadata analyzer 202 analyzes the metadata of a data portion in the data received from the data of the input data sources 102A-N… the metadata includes the metadata labels directly extracted from the input data sources 102A-N. This includes column labels, schema names, database names, tables names, XML tags, filenames, file headers, other tags, file metadata, and so on).

Regarding Claim 11, Williamson discloses the computer-implemented method of claim 10, wherein determining the first data classifier corresponding to the first database schema further comprises: 
retrieving at least one of the name metadata, the column metadata, or the row metadata associated with the first plurality of data objects ([0050]: The metadata analyzer 202 analyzes the metadata of a data portion in the data received from the data of the input data sources 102A-N… the metadata includes the metadata labels directly extracted from the input data sources 102A-N. This includes column labels, schema names, database names, tables names, XML tags, filenames, file headers, other tags, file metadata, and so on); and 
determining the first data classifier based further on at least one of the name metadata, the column metadata, or the row metadata (Fig. 2; [0050]: The metadata analyzer 202 may determine that a data portion is sensitive data if it indicates one or more sensitive data types, including but not limited to 1) names. See also para [0040]-[0044]).

Regarding Claim 12, Williamson discloses the computer-implemented method of claim 9, wherein the first plurality of data objects comprise a second data table, wherein the computer-implemented method further comprises: 
determining correlation metadata associated with the first data table and the second data table ([0028]: In one embodiment, the data pre-processor 106 combines data having the same metadata labels. This may be achieved using a matching algorithm that matches metadata labels having the exact same label or labels that are within a threshold degree of similarity (e.g., labels that match if non-alphanumeric characters are removed)… For example, a column in a database may have a metadata label indicating the data in the column are credit card numbers); and 
determining the first data classifier based further on the correlation metadata ([0041]: Logical rules may be modified by the classifier refinement engine 110 based on configuration information provided by a user… Reference tables may be updated using newly received reference data. Contextual matching may also be updated based on new indications of contextual data).

Regarding Claim 13, Williamson discloses the computer-implemented method of claim 9, further comprising: 
determining domain metadata associated with the first data table ([0054]: The pattern matcher 206 matches data portions received in the data from the data pre-processor 106 with various patterns… Emails may be matched by a string of characters (the local-part), followed by the “@” symbol, followed by an alphanumeric string of characters (the domain) that may include a period, and then ending with a period and sequence of characters that matches a top level domain (e.g., “.com,” “.mail”)); and 
determining the first data classifier based further on the domain metadata ([0055]: The rules may fall into various categories… Another rule may indicate that the data portion must pass some sort of verification test, such as Luhn check for credit card numbers, or having an email domain be checked to see if it is a valid domain).

Regarding Claim 14, Williamson discloses the computer-implemented method of claim 8, further comprising: 
calculating a confidence score associated with the first data classifier ([0069]: The confidence value calculator 218 determines a confidence value for each data portion that is scanned… This means that the data portion is first checked to see if it is of a right type and format for the data classifier 108 component to process. In such a case, a data portion that is processed by a data classifier 108 component is given an initial confidence value); and 
determining whether the confidence score satisfies a predetermined threshold ([0038]: In one embodiment, the data classifier 108 further determines the security level of data portions determined to be sensitive (or likely to be sensitive and having a confidence value beyond a certain threshold). See also para [0070], [0072], [0102]). 

Regarding Claim 15, Williamson discloses a computer program product comprising at least one non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising an executable portion ([0111]: FIG. 13; The computer system 1300 can be used to execute instructions 1324 (e.g., program code or software) for causing the machine to perform any one or more of the methodologies (or processes) described herein) configured to: 
retrieve a first plurality of data objects associated with a first database schema from a database (Fig. 1; [0026]: The data pre-processor 106 receives data from the input data sources 102A-N; [0025]: Examples of input data sources 102A-N include relational databases; [0050]: the metadata includes… schema names, database names. See also Fig. 2, Metadata (Schema) Analyzer 202, para [0059]); 
determine, based at least on the first plurality of data objects, a first data classifier corresponding to the first database schema (Fig. 1; [0026]: The data pre-processor 106 receives data from the input data sources 102A-N and pre-process the data for use by the data classifier 108); 
generate a mapping specification based at least in part on the first data classifier and the first plurality of data objects ([0027]: The data pre-processor 106 may determine the relationship between the input data sources 102A-N using various rules), 
wherein the mapping specification is configured to convert the first plurality of data objects associated with the first database schema to a second plurality of data objects associated with a second database schema ([0026]: In another embodiment, the data pre-processor 106 converts the data in the input data sources 102A-N to a common data structure that may be parsed by the data classifier 108… This common data structure, may be, for example, a graph structure that indicates the relationship between various data elements from the input data (via connecting edges in the graph), and presents the input data in a single format, such as text [the common data structure corresponds to the second database schema or structure that defines how the data is organized]. See also para [0027]-[0028]); and 
generate the second plurality of data objects based at least in part on the first plurality of data objects and the mapping specification ([0027]-[0028]: The data pre-processor 106 may place each converted data element of the input data sources 102A-N in its own unit in the common data structure… The data pre-processor 106 may further combine data from different input data sources 102A-N together into the same common data structure… Furthermore, the data pre-processor 106 may combine data associated with a single metadata label into a single unit, e.g., all data under a single structure, in the common data structure [data in the common data structure corresponds to the second plurality of data objects]).

Regarding Claim 16, Williamson discloses the computer program product of claim 15, wherein the first plurality of data objects comprise a first data table, wherein the first data table comprises at least one data field ([0029]: Each subsection may include groups of data portions that are categorized together, such as within a single file, data instance, container, or database table; [0031]: Instead, the data classifier 108 may assume that a unit of data that is separated from other data using a delimiter, such as a cell boundary in a table… is a data portion. Thus, a data portion may be a cell in a database, table, or other data structure, a table in a database. See also para [0100]).

Regarding Claim 17, Williamson discloses the computer program product of claim 16, wherein the first data table comprises at least one of name metadata, column metadata, or row metadata ([0050]: The metadata analyzer 202 analyzes the metadata of a data portion in the data received from the data of the input data sources 102A-N… the metadata includes the metadata labels directly extracted from the input data sources 102A-N. This includes column labels, schema names, database names, tables names, XML tags, filenames, file headers, other tags, file metadata, and so on).

Regarding Claim 18, Williamson discloses the computer program product of claim 17, wherein, when determining the first data classifier corresponding to the first database schema, the executable portion is configured to: 
retrieve at least one of the name metadata, the column metadata, or the row metadata associated with the first plurality of data objects ([0050]: The metadata analyzer 202 analyzes the metadata of a data portion in the data received from the data of the input data sources 102A-N… the metadata includes the metadata labels directly extracted from the input data sources 102A-N. This includes column labels, schema names, database names, tables names, XML tags, filenames, file headers, other tags, file metadata, and so on); and 
determine the first data classifier based further on at least one of the name metadata, the column metadata, or the row metadata (Fig. 2; [0050]: The metadata analyzer 202 may determine that a data portion is sensitive data if it indicates one or more sensitive data types, including but not limited to 1) names. See also para [0040]-[0044]).

Regarding Claim 19, Williamson discloses the computer program product of claim 16, wherein the first plurality of data objects comprise a second data table, wherein the executable portion is configured to: 
determine correlation metadata associated with the first data table and the second data table ([0028]: In one embodiment, the data pre-processor 106 combines data having the same metadata labels. This may be achieved using a matching algorithm that matches metadata labels having the exact same label or labels that are within a threshold degree of similarity (e.g., labels that match if non-alphanumeric characters are removed)… For example, a column in a database may have a metadata label indicating the data in the column are credit card numbers); and 
determine the first data classifier based further on the correlation metadata ([0041]: Logical rules may be modified by the classifier refinement engine 110 based on configuration information provided by a user… Reference tables may be updated using newly received reference data. Contextual matching may also be updated based on new indications of contextual data).

Regarding Claim 20, Williamson discloses the computer program product of claim 16, wherein, when determining the first data classifier corresponding to the first database schema, the executable portion is configured to: 
determine domain metadata associated with the first data table ([0054]: The pattern matcher 206 matches data portions received in the data from the data pre-processor 106 with various patterns… Emails may be matched by a string of characters (the local-part), followed by the “@” symbol, followed by an alphanumeric string of characters (the domain) that may include a period, and then ending with a period and sequence of characters that matches a top level domain (e.g., “.com,” “.mail”)); and 
determine the first data classifier based further on the domain metadata ([0055]: The rules may fall into various categories… Another rule may indicate that the data portion must pass some sort of verification test, such as Luhn check for credit card numbers, or having an email domain be checked to see if it is a valid domain).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIRLEY D. HICKS whose telephone number is (571) 272-3304. The examiner can normally be reached on Mon - Fri 7:30 - 4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/S.D.H./Examiner, Art Unit 2168    

/IRETE F EHICHIOYA/Supervisory Patent Examiner, Art Unit 2168