DETAILED ACTION
Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2. The Action is responsive to the Application filed June 3, 2019. 
3. Please note that claims 1-27 are pending, of which claims 1-8, 11-21 and 24-27 are rejected, claims 9-10 and 22-23 are objected to, and claims 1, 14 and 27 are independent.
Claim Objections
4. Claim 26 is objected to because of the following informalities:  
As per claim 26, the claim recites “The system of claim 1, …”.
The “claim 1” appears to be a typographical error for “claim 14”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
5.1. The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


5.1.1. Claims 8-9 and 21-22 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
data description …”, “… the data description for the source data column” and “… a data description of the source data column …”, in succeeding. The antecedent basis for the data descriptions as recited seems ambiguous. 
5.2. The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

5.2.1. Claims 1, 14 and 17 are rejected under 35 U.S.C. 112(a) because the specification, while being enabling for storing columns in a table and loading the table into a database, does not reasonably provide enablement for the storing and the loading.
As per the claims, the claims recite “store[ing] the plurality of source data columns in a data table generated to transform the data schema of the data file into the target data schema of the target database, wherein the source data column corresponds to the first target data column”; and “load[ing] the generated data table into the target database”. 
To one of ordinary skill in the art, data columns, data tables and databases are understood to be data structures. Further, per ANSI SQL, columns are defined as attributes in a CREATE TABLE SQL statement; for example, CREATE TABLE USER (CITY CHAR(32) …) specifies a column named CITY in the table USER.
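For illustration only, the CREATE TABLE example above can be exercised as the following minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for an ANSI SQL database; the in-memory database and the quoting of the table name are assumptions of this sketch and are not taken from the application or the cited references.

```python
import sqlite3

# In-memory database used only for this sketch; the table and column
# names follow the CREATE TABLE example in the text.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "USER" (CITY CHAR(32))')

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute('PRAGMA table_info("USER")')]
print(columns)
conn.close()
```

The single entry in the resulting list reflects that the column CITY exists only as an attribute declared in the table definition.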

Therefore, the specification does not enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to convey the invention commensurate in scope with these claims. 
Claim Rejections - 35 USC § 103
6. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

6.1. Claims 1-4, 13-17 and 26-27 are rejected under 35 U.S.C. § 103 as being unpatentable over
Reynolds et al.: "COMPUTERIZED TOOL IMPLEMENTATION OF LAYERED DATA FILES TO DISCOVER, FORM, OR ANALYZE DATASET INTERRELATIONS OF NETWORKED COLLABORATIVE DATASETS", (U.S. Patent Application Publication US 20180262864 A1, hereafter "Reynolds"), in view of
Crow et al.: "SYSTEM AND METHOD FOR USE IN TEXT ANALYSIS OF DOCUMENTS AND RECORDS", (U.S. Patent 6665661 B1, filed September 29, 2000 and issued December 16, 2003, hereafter "Crow").

As per claim 1, Reynolds teaches a method for automatically ingesting data from disparate data sources having respective data schemas into a target database having a target data schema, comprising: 
receiving, from a user, a data file comprising a plurality of source data columns formatted according to a data schema of a data source and comprising a data dictionary comprising information describing the plurality of source data columns (See Reynolds: [0090] and [0098], dataset analyzer 630 classifies subsets of data (e.g., each subset of data as a column) in data file 601a as a particular data classification, such as a particular data type. For example, a column of integers may be classified as "year data" if the integers are in one of a number of year formats expressed in accordance with a Gregorian calendar schema; if a column includes a number of cells that each include five digits, dataset analyzer 630 also may be configured to classify the digits as constituting a "zip code." Dataset analyzer 630 analyzes data file 601a to note the exceptions in the processing pipeline, and to append, embed, associate, or link user interface elements or features to one or more elements of data file 601a to facilitate collaborative user interface functionality (e.g., at a presentation layer) with respect to a user interface. Further, dataset analyzer 630 analyzes data file 601a relative to dataset-related data to determine correlations among dataset attributes of data file 601a and other datasets 603b (and attributes, such as metadata 603a). Once a subset of correlations has been determined, a dataset formatted in data file 601a (e.g., as an annotated tabular data file, or as a CSV file) may be enriched, for example, by associating links to the dataset of data file 601a to form the dataset of data file 601b, which, in some cases, may have a similar data format as data file 601a (e.g., with data enhancements, corrections, and/or enrichments); and dataset attribute manager receives correlated attributes derived from attribute correlator 663.
In some cases, correlated attributes may relate to correlated dataset attributes based on data in data store 662 or based on data in data store 664. Here, the dataset metadata, dataset data, schema data, ontology data and dataset attributes utilized by the dataset analyzer and the dataset attribute manager, in combination, constitute the data dictionary for data analysis of the received file).
Reynolds does not explicitly teach generating a plurality of count data for each cell of a plurality of cells selected from a source data column of the plurality of source data columns, each count datum comprising a number of occurrences of a characteristic detected in each cell.

It would have been obvious to one having ordinary skill in the art at the time the applicant's application was filed to combine Crow's teaching with the Reynolds reference because Crow is dedicated to analyzing data records for mining text information, Reynolds is dedicated to facilitating consolidation of one or more datasets by using configured computerized tools to discover, form, and analyze, and the combined teaching would have enabled Reynolds to utilize Crow’s teaching on analyzing data records to the word level to reduce efforts by data scientists and data practitioners in extracting, transforming, and loading data into data stores in a manner that serves their desired objectives.

selecting one or more target data columns from a plurality of target data columns specified in the target data schema as being semantically related to the source data column based on the plurality of count data for each cell, a column header of the source data column, and the data dictionary (See Reynolds: [0099], dataset 601a may be enriched with data extracted from (or linked to) other datasets identified by (or sharing similar) dataset attributes, such as data representing a user account identifier, user characteristics, similarities to other datasets, one or more other user account identifiers that may be associated with a dataset, data-related activities associated with a dataset (e.g., identity of a user account identifier associated with creating, modifying, querying, etc. a particular dataset), as well as other attributes, such as a "usage" or type of usage associated with a dataset; and Crow: col. 10, lines 21-23 and 27-33, when all the words in the current cell are processed, a complete list of FreqTerms has been created for that cell and the contents of each of the collected FreqTerms (a termID and a frequency-of-occurrence value) are appended to a temporary buffer, and when all the columns in the current row have been processed, this temporary buffer will contain an encoding of the contents of each of the current row's text cells, in left-to-right column order. The temporary buffer will later be written to the cell-to-term file. Here, each word is the characteristic detected in each cell); 
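For illustration only, the per-cell term counting relied upon from Crow (a termID paired with a frequency-of-occurrence value, appended to a temporary buffer in left-to-right column order) can be sketched as follows. The function name, the whitespace tokenization, the lowercasing, and the sample row are hypothetical choices made for this sketch and are not taken from Crow.

```python
from collections import Counter

def cell_term_frequencies(cell_text, term_ids):
    """Return (term_id, frequency) pairs for the words of one text cell.

    term_ids is a dictionary shared across cells; a word receives the
    next unused integer id the first time it is seen (an assumption).
    """
    counts = Counter(cell_text.lower().split())
    pairs = []
    for word, freq in counts.items():
        term_id = term_ids.setdefault(word, len(term_ids))
        pairs.append((term_id, freq))
    return pairs

term_ids = {}
row = ["New York New Jersey", "york city"]  # hypothetical text cells of one row
buffer = []                                 # stands in for the temporary buffer
for cell in row:
    buffer.extend(cell_term_frequencies(cell, term_ids))
print(buffer)
```

Under these assumptions the word "new" is counted twice in the first cell, and the word "york" reuses the same term id in both cells.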

receiving an input from the user that selects a graphical representation corresponding to a first target data column from the one or more selected target data columns for the source data column (See Reynolds: Fig. 8 and [0104], a subset of data of the dataset is interpreted against subsets of data (e.g., columns of data) for one or more data classifications (e.g., datatypes) to infer or derive at least an inferred attribute for a subset of data (e.g., a column of data). In some examples, the subset of data may relate to a columnar representation of data in a tabular data format, or CSV file, with, for example, columns annotated. Annotations may include descriptions of a data type (e.g., string, numeric, categorical, etc.), a data classification (e.g., a location, such as a zip code, etc.), or any other data or metadata that may be used to locate in a search or to link with other datasets.); 
storing the plurality of source data columns in a data table generated to transform the data schema of the data file into the target data schema of the target database, wherein the source data column corresponds to the first target data column (See Reynolds: Fig. 16 and [0140] and [0142], the set of data may be transformed from a 
loading the generated data table into the target database (See Reynolds: [0182], "Add files" user input 2445 may be activated via pointer element 2447a to initiate adding (e.g., uploading) files to provide additional datasets or to correct data in set of data).

As per claim 2, Reynolds in view of Crow teaches the method of claim 1, wherein selecting one or more target data columns from the plurality of target data columns specified in the target data schema comprises: 

selecting a first set of data columns from the plurality of target data columns based on the one or more determined clusters, wherein the first set of data columns comprises the one or more target data columns (See Reynolds: Fig. 28 and [0192], data representing summary characteristic data for subsets of data may be presented in a user interface, an interactive overlay window to convey interactive summary characteristics for a column of data. At 2808, an interactive overlay window may be configured to include aggregated data attributes (e.g., an aggregation or collection of summary 

As per claim 3, Reynolds in view of Crow teaches the method of claim 2, comprising: 
generating a second plurality of count data for each cell of a plurality of cells selected from each target data column of the plurality of target data columns of the target database (See Crow: col. 11, lines 11-17, the term-to-cell files (block 350) are ready to be created. Recall that, while the cell-to-term files were being created, the count of the number of cells where each term was found were accumulated and stored in a "cell counts" array. Here a second of the cell-to-term files created teaches a second plurality of count data for each cell of a plurality of cells selected); 
clustering the plurality of cells for each target data column into the plurality of clusters based on the second plurality of count data for each cell (See Crow: col. 11, lines 11-17, the term-to-cell files (block 350) are ready to be created. Recall that, while the cell-to-term files were being created, the count of the number of cells where each term was found were accumulated and stored in a "cell counts" array. Here the array teaches clustering); and 
associating a set of target data columns from the plurality of target data columns with each cluster based on a number of cells from the set of target data columns being 

As per claim 4, Reynolds in view of Crow teaches the method of claim 2, wherein determining the one or more clusters comprises: 
executing a machine learning algorithm configured to assign a cluster from the plurality of clusters to each cell based on the plurality of count data for each cell (See Reynolds: [0096], attribute correlator 663 may be configured to detect patterns or classifications among datasets and other data through the use of Bayesian networks, clustering analysis, as well as other known machine learning techniques or deep-learning techniques (e.g., including any known artificial intelligence techniques).).

As per claim 13, Reynolds in view of Crow teaches the method of claim 1, wherein displaying the one or more graphical representations corresponding to the one or more selected target data columns comprises: 
displaying a graphical icon next to the first target data column from the one or more selected target data columns indicating that the first target data column was 

As per claims 14-17 and 26, the claims recite a system for automatically ingesting data from disparate data sources having respective data schemas into a target database having a target data schema, comprising one or more processors (See Reynolds: [0057], subsets of executable code and, optionally, one or more processors for performing any number of functions by executing the executable code); and memory storing one or more programs  (See Reynolds: [0217], Instructions may be embedded in software or firmware. The term "computer readable medium" refers to any tangible medium that participates in providing instructions to processor 3404 for execution) that when executed by the one or more processors cause the one or more processors to perform the steps of the methods recited in claims 1-4 and 13 and rejected above, respectively, under 35 U.S.C. § 103 as being unpatentable over Reynolds in view of Crow.
Therefore, claims 14-17 and 26 are rejected along the same rationale that rejected claims 1-4 and 13, respectively.

As per claim 27, the claim recites a non-transitory computer-readable storage medium comprising instructions for ingesting data from disparate data sources having respective data schemas into a target database having a target data schema, wherein the instructions, when executed by one or more processors (See Reynolds: [0057] and [0217], subsets of executable code and, optionally, one or more processors for performing any number of functions by executing the executable code, and Instructions may be embedded in software or firmware. The term "computer readable medium" refers to any tangible medium that participates in providing instructions to processor 3404 for execution), cause the one or more processors to perform instructions comprising the steps of the methods recited in claim 1 and rejected above, under 35 U.S.C. § 103 as being unpatentable over Reynolds in view of Crow.
Therefore, claim 27 is rejected along the same rationale that rejected claim 1.

6.2. Claims 5-8 and 18-21 are rejected under 35 U.S.C. § 103 as being unpatentable over
Reynolds in view of Crow, as applied to claims 1-4, 13-17 and 26-27 above, and further in view of
JAGOTA: "SYSTEM AND METHOD FOR MAPPING SOURCE COLUMNS TO TARGET COLUMNS", (U.S. Patent Application Publication US 20130297661 A1, filed February 21, 2013 and published November 7, 2013).

As per claim 5, Reynolds in view of Crow teaches the method of claim 2, wherein selecting one or more target data columns from the plurality of target data columns specified in the target data schema comprises: 
selecting a second set of data columns from the first set of data columns, wherein the second set of data columns comprises the one or more target data columns (See Reynolds: [0097], inference engine 632 may receive data (e.g., enrichment data 607b) from a dataset attribute manager 661, where enrichment data 607b may include derived data or link-related data to form collaborative datasets. Consider that attribute correlator 663 can detect patterns in datasets in repositories 640a to 640c, among other sources of data, whereby the patterns identify or correlate to a subset of relevant datasets that may be linked with the dataset in data 601a.).
However, Reynolds in view of Crow does not explicitly teach the selecting the second set of data columns based on header comparisons between the source data column and each target data column of the first set of data columns.
On the other hand, as analogous art on dataset ingestion, JAGOTA teaches selecting the data columns based on header comparisons between the source data 
It would have been obvious to one having ordinary skill in the art at the time the applicant's application was filed to combine JAGOTA's teaching with the Reynolds in view of Crow reference because JAGOTA is dedicated to evaluating character strings for mapping columns from a source file to a target file, Crow is dedicated to analyzing data records for mining text information, Reynolds is dedicated to facilitating consolidation of one or more datasets by using configured computerized tools to discover, form, and analyze, and the combined teaching would have enabled Reynolds in view of Crow to utilize JAGOTA’s teaching on analyzing data records to the word level more accurately because of correlating columns in a source file with defined entities of the database model to transform and import the source data into the target.

As per claim 6, Reynolds in view of Crow and further in view of JAGOTA teaches the method of claim 5, wherein selecting the second set of data columns comprises: 
for each header comparison between the source data column and each target data column of the first set of data columns, determining a number of string operations 
selecting the second set of data columns from the first set of data columns based on the number of string operations determined for each target data column of the first set of data columns (See Reynolds: [0097], inference engine 632 may receive data (e.g., enrichment data 607b) from a dataset attribute manager 661, where enrichment data 607b may include derived data or link-related data to form collaborative datasets. Consider that attribute correlator 663 can detect patterns in datasets in repositories 640a to 640c, among other sources of data, whereby the patterns identify or correlate to a subset of relevant datasets that may be linked with the dataset in data 601a.).

As per claim 7, Reynolds in view of Crow and further in view of JAGOTA teaches the method of claim 6, wherein each of the string operations comprises deleting a character, adding a character, or substituting a character (See JAGOTA: [0074] In step 402, an input character string is received).
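For illustration only, the number of string operations recited in claims 6 and 7 (deleting, adding, or substituting a character) can be sketched as a Levenshtein edit distance between two column headers. This sketch assumes that the "number of string operations" is the minimum number of such operations; the function name and the sample headers are hypothetical and are not taken from JAGOTA or from the application.

```python
def edit_distance(source_header, target_header):
    """Minimum number of single-character deletions, additions, or
    substitutions needed to turn source_header into target_header."""
    # prev[j] holds the distance between the processed prefix of
    # source_header and the first j characters of target_header.
    prev = list(range(len(target_header) + 1))
    for i, src_char in enumerate(source_header, start=1):
        curr = [i]
        for j, tgt_char in enumerate(target_header, start=1):
            substitution_cost = 0 if src_char == tgt_char else 1
            curr.append(min(
                prev[j] + 1,                      # delete a character
                curr[j - 1] + 1,                  # add a character
                prev[j - 1] + substitution_cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]

# Rank hypothetical target headers by closeness to a hypothetical source header.
source = "cust_name"
targets = ["customer_name", "city", "zip_code"]
ranked = sorted(targets, key=lambda header: edit_distance(source, header))
print(ranked[0])
```

A smaller count indicates a closer header match, so a second set of columns could be selected by keeping the targets whose count falls below a chosen threshold; the threshold itself is not specified in the text above.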

As per claim 8, Reynolds in view of Crow and further in view of JAGOTA teaches the method of claim 5, wherein selecting one or more target data columns from the plurality of target data columns specified in the target data schema comprises: 
selecting a third set of data columns from the second set of data columns based on data description comparisons between the source data column and each target data column of the second set of data columns, wherein the third set of data columns comprises the one or more target data columns, and wherein the data dictionary comprises the data description for the source data column (See Reynolds: [0097] and [0103], inference engine 632 may receive data (e.g., enrichment data 607b) from a dataset attribute manager 661, where enrichment data 607b may include derived data or link-related data to form collaborative datasets. Consider that attribute correlator 663 can detect patterns in datasets in repositories 640a to 640c, among other sources of data, whereby the patterns identify or correlate to a subset of relevant datasets that may be linked with the dataset in data 601a; and determining an annotative description of data for a column).

As per claims 18-21, the claims recite a system for automatically ingesting data from disparate data sources having respective data schemas into a target database having a target data schema, comprising one or more processors (See Reynolds: [0057], 
Therefore, claims 18-21 are rejected along the same rationale that rejected claims 5-8, respectively.

6.3. Claims 11-12 and 24-25 are rejected under 35 U.S.C. § 103 as being unpatentable over
Reynolds in view of Crow, as applied to claims 1-4, 13-17 and 26-27 above, and further in view of
KIM et al.: "SYSTEM AND METHOD FOR MAPPING SOURCE COLUMNS TO TARGET COLUMNS", (U.S. Patent Application Publication US 20130297661 A1, filed February 21, 2013 and published November 7, 2013, hereafter “KIM”).


However, KIM teaches randomly selecting a predetermined number of cells from the source data column, wherein the predetermined number of cells corresponds to the plurality of selected cells (See [0013] and [0065], the controller may select the first memory blocks by randomly selecting a predetermined number of word lines coupled to the plurality of memory blocks and by selecting as the first memory blocks one or more memory blocks each including one or more of the selected word lines among the plurality of memory blocks. The cell string 340 of each column may include one or more drain select transistors and one or more source select transistors).
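For illustration only, the claimed step of randomly selecting a predetermined number of cells from a source data column can be sketched as follows. The function name, the optional seed, and the sample column are hypothetical; the sketch is not drawn from KIM, which concerns word lines and memory blocks.

```python
import random

def sample_cells(column, predetermined_number, seed=None):
    """Randomly select up to predetermined_number cells from a column,
    without replacement. The seed parameter exists only so that this
    sketch can be made repeatable."""
    rng = random.Random(seed)
    k = min(predetermined_number, len(column))
    return rng.sample(column, k)

source_column = ["94105", "10001", "60601", "73301", "02108"]  # hypothetical cells
selected = sample_cells(source_column, 3, seed=0)
print(selected)
```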
It would have been obvious to one having ordinary skill in the art at the time the applicant's application was filed to combine KIM's teaching with the Reynolds in view of Crow reference because KIM is dedicated to performing a read reclaim operation and adjusting a read reclaim count value, Crow is dedicated to analyzing data records for mining text information, Reynolds is dedicated to facilitating consolidation of one or more datasets by using configured computerized tools to discover, form, and analyze, and the combined teaching would have enabled Reynolds in view of Crow to utilize KIM’s teaching on analyzing data records to the word level more efficiently because of 

As per claim 12, Reynolds in view of Crow and further in view of KIM teaches the method of claim 1, wherein the number of occurrences of a characteristic detected in each cell comprises: 
a number of alphabetical characters in the cell (See KIM: [0082], randomly select a predetermined number of word lines and select one or more memory blocks each including one or more of the selected word lines among the memory blocks), a number of digits in the cell, a number of white spaces in the cell, a number of special characters in the cell, a number of total characters in the cell, a number of people names identified in the cell, a number of location names identified in the cell, a number of nouns identified in the cell, or a number of verbs identified in the cell.
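For illustration only, several of the count data recited in claim 12 can be sketched per cell as follows. The function name and dictionary keys are hypothetical, and "special characters" is assumed here to mean every character that is not alphabetical, a digit, or white space. Counts of people names, location names, nouns, and verbs would require a language-processing component and are omitted from this sketch.

```python
def count_data(cell):
    """Return character-level count data for a single cell."""
    text = str(cell)
    alphabetical = sum(ch.isalpha() for ch in text)
    digits = sum(ch.isdigit() for ch in text)
    white_spaces = sum(ch.isspace() for ch in text)
    total = len(text)
    return {
        "alphabetical": alphabetical,
        "digits": digits,
        "white_spaces": white_spaces,
        # Assumption: whatever remains is treated as a special character.
        "special": total - alphabetical - digits - white_spaces,
        "total": total,
    }

zip_counts = count_data("94105-1234")     # hypothetical cell contents
city_counts = count_data("San Jose, CA")  # hypothetical cell contents
print(zip_counts)
print(city_counts)
```

Cells from the same kind of column tend to produce similar count vectors (for example, mostly digits for a postal code), which is what makes such counts usable as input to clustering.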

As per claims 24-25, the claims recite a system for automatically ingesting data from disparate data sources having respective data schemas into a target database having a target data schema, comprising one or more processors (See Reynolds: [0057], subsets of executable code and, optionally, one or more processors for performing any number of functions by executing the executable code); and memory storing one or more programs  (See Reynolds: [0217], Instructions may be embedded in software or 
Therefore, claims 24-25 are rejected along the same rationale that rejected claims 11-12, respectively.
Allowable Subject Matter
7. Claims 9-10 and 22-23 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
 
References
8.1. The prior art made of record: 
   A. U.S. Patent Application Publication US-20180262864-A1.
   B. U.S. Patent US-6665661-B1.
   C. U.S. Patent Application Publication US-20130297661-A1.
   D. U.S. Patent Application Publication US-20180181326-A1.
8.2. The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure. 
   E. U.S. Patent Application Publication US-20030028418-A1.
   U. Oracle7 Server Utilities, Release 7.3 February 1996 
Conclusion
9.1. Examiner has cited particular columns and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner. SEE MPEP 2141.02 [R-5] VI. PRIOR ART MUST BE CONSIDERED IN ITS ENTIRETY, INCLUDING DISCLOSURES THAT TEACH AWAY FROM THE CLAIMS: A prior art reference must be considered in its entirety, i.e., as a whole, including portions that would lead away from the claimed invention. W.L. Gore & Associates, Inc. v. Garlock, Inc., 721 F.2d 1540, 220 USPQ 303 (Fed. Cir. 1983), cert. denied, 469 U.S. 851 (1984); In re Fulton, 391 F.3d 1195, 1201, 73 USPQ2d 1141, 1146 (Fed. Cir. 2004). See also MPEP §2123. 
9.2. In the case of amending the Claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied on for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention. 
Contact Information
10. Any inquiry concerning this communication or earlier communications from the Examiner should be directed to KUEN S LU whose telephone number is (571)272-4114. The examiner can normally be reached on M-F, 8-19, Mid-Flex 2 hrs.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's Supervisor, Mrs. Tamara T Kyle can be reached on 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, please call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

KUEN S LU /Kuen S Lu/
Patent Examiner
 
Art Unit 2156 
March 18, 2021