DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/04/22 has been entered.

Response to Amendment
The amendment filed on 10/04/22 has been entered. Claims 1, 3-6, 8, 10, 12-14, 17, 19-20 are pending in the application. It is acknowledged that claims 2, 11, 15 are cancelled, while claims 19-20 have been newly added.

Allowable over Prior Art
Regarding 35 USC 103, the prior art made of record neither renders obvious nor anticipates the combination of claimed elements, as recited in claims 1, 10. That is, the prior art fails to disclose “determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file; and computing, based on the mode of said each candidate column delimiter, a total deviation value for said each candidate column delimiter of the plurality of candidate column delimiters; comparing the total deviation values of the plurality of candidate column delimiters to determine that a particular candidate column delimiter comprises a lowest total deviation of the plurality of candidate column delimiters and, in response, selecting the particular candidate column delimiter”.
Further, the prior art made of record neither renders obvious nor anticipates the combination of claimed elements, as recited in claim 19. That is, the prior art fails to disclose “determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file; receiving, from a user device, a second row delimiter for the data input file; using the second row delimiter, identifying a second plurality of rows from the sample excerpt that is not included in the header data for the data input file”.
Claims 3-6, 8, 12-14, 17, 20 are also allowable over the prior art due to their dependency on claims 1, 10. However, these claims would still need to be amended or cancelled in order to overcome the current double patenting, 35 USC 112, and 35 USC 101 rejections to put the claims into condition for allowance.
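For illustration only, the header-determination limitation quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the set of delimiter symbols, and the reading of "delimited value" in the second row as a delimited numeric value are all hypothetical choices made for the example.

```python
import re

# Assumption: fields are separated by one of these symbols (hypothetical set).
_FIELD_SPLIT = re.compile(r"[,;\t|]")
# Assumption: a "numeric value" is a plain signed integer or decimal.
_NUMERIC = re.compile(r"[+-]?\d+(\.\d+)?")


def has_delimited_numeric(row: str) -> bool:
    """Return True if any delimited field in the row is a numeric value."""
    return any(_NUMERIC.fullmatch(field.strip()) for field in _FIELD_SPLIT.split(row))


def first_row_is_header(rows: list[str]) -> bool:
    """Treat the first row as header data when it has no delimited numeric
    value and the row that follows it does."""
    if len(rows) < 2:
        return False
    return not has_delimited_numeric(rows[0]) and has_delimited_numeric(rows[1])
```

With these assumptions, `first_row_is_header(["name,age", "alice,30"])` evaluates to true, while a sample whose first row already contains numbers is not treated as having a header.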

Double Patenting
	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1, 3-6, 8, 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-9 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928). Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of the present application would have been obvious to one of ordinary skill in the art over claims 1-9 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928).
Claim comparison: each claim of US 16/748,351 is followed by the corresponding claim(s) of US 10,204,119.
Claim 1 (US 16/748,351):
A method comprising: receiving a data input file…;

selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file;

analyzing the sample excerpt to determine header data for the data input file, determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file;

analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file;

using the row delimiter, identifying a plurality of rows from the sample excerpt that is not included in the header data for the data input file;

storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters of the plurality of particular candidate column delimiters in the plurality of rows; 
identifying, a plurality of candidate column delimiters, symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data;

for each candidate column delimiter of the plurality of candidate column delimiters: identifying a number of instances of said each candidate column delimiter of the plurality of candidate column delimiters in each row of the plurality of rows, identifying a mode of said each candidate column delimiter as a most frequent number of instances of said each candidate column delimiter in the plurality of rows, and computing, based on the mode of said each candidate column delimiter, a total deviation value for said each candidate column delimiter of the plurality of candidate column delimiters;

comparing the total deviation values of the plurality of candidate column delimiters to determine that a particular candidate column delimiter comprises a lowest total deviation of the plurality of candidate column delimiters and, in response, selecting the particular candidate column delimiter;

using the particular candidate column delimiter and the row delimiter to generate a candidate schema for the data input file;


using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns;

wherein the method is performed using one or more processors
Claims 1-3, 8 (US 10,204,119):
A method comprising: receiving a data input file;

selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file;

[claim 2] “analyzing the sample excerpt to determine header data for the data input file, the header data comprising one or more strings in the data input file; wherein analyzing the sample excerpt to determine the column delimiter, the row delimiter, and the plurality of data format types comprises analyzing only data in the sample excerpt that is not included in the header data;” [claim 3] “wherein analyzing the sample excerpt to determine header data for the data input file comprises: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; based, at least in part, on determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of header data;”

analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file;

analyzing the sample excerpt to determine a column delimiter for the data input file, the column delimiter comprising one or more symbols that delimit each particular column of a plurality of columns in the data input file;
wherein analyzing the sample excerpt to determine a column delimiter for the data input file comprises:

[claim 2] “using the row delimiter, identifying a plurality of rows;
identifying, in the plurality of rows, one or more candidate column delimiters; … wherein analyzing the sample excerpt to determine the column delimiter, the row delimiter, and the plurality of data format types comprises analyzing only data in the sample excerpt that is not included in the header data”

[claim 8] “wherein identifying the one or more candidate column delimiters comprises: storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters in the sample excerpt; determining that a total deviation for the one or more particular column delimiters exceeds a stored deviation threshold and, in response, identifying, as at least one of the one or more candidate column delimiters, one or more symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data”

for each candidate column delimiter of the one or more candidate column delimiters:
identifying a number of instances of the candidate column delimiter in each the plurality of rows; determining a mode of the numbers of instances of the candidate column delimiter in each of the plurality of rows; and computing a total deviation for the candidate column delimiter, the total deviation comprising a sum of deviations of the number of instances of the candidate column delimiter in each of the plurality of rows from the mode of the numbers of instances of the candidate column delimiter in each of the plurality of rows;

determining that a particular candidate column delimiter comprises a lowest total deviation of the candidate column delimiters and, in response, selecting the particular candidate column delimiter (in light of [col. 9, lines 32-38] of the patent’s specification which utilize deviations from a mode to identify the candidate column delimiter and further recite an example involving comparison of deviations of delimiters and making a selection based on the comparison; further, no alternative methods or steps are disclosed for determining the lowest total deviation);

analyzing the sample excerpt to determine a plurality of data format types, each particular data format type corresponding to a particular column of each particular column of the plurality of columns in the data input file; wherein analyzing the sample excerpt to determine a plurality of data format types comprises: using the row delimiter, identifying a plurality of rows; using the column delimiter, identifying a plurality of columns; for each column of the plurality of columns performing: parsing data in the plurality of rows using a plurality of data formats; determining that data in one or more rows of the plurality of rows cannot be parsed with one or more first data formats of the plurality of data formats; identifying one or more candidate data formats for the column excluding the one or more first data formats; and selecting a second data format from the one or more candidate data formats;

using the column delimiter, row delimiter, and plurality of data format types to generate a candidate schema for the data input file;

using the candidate schema and the data input file, generating a plurality of sample rows and sample columns;

displaying the plurality of sample rows and sample columns through a graphical user interface; wherein the method is performed using one or more processors
Claim 3 (US 16/748,351):
further comprising using the header data for the data input file, extracting one or more column names for the plurality of columns
Claim 4 (US 10,204,119):
further comprising using the header data, extracting one or more column names for the plurality of columns
Claim 4 (US 16/748,351):
wherein analyzing the sample excerpt to determine a row delimiter for the data input file comprises: storing row delimiter whitelist data comprising a plurality of candidate row delimiters;

searching the sample excerpt to locate a particular candidate row delimiter, wherein the particular candidate row delimiter is a first occurrence of any of the plurality of candidate row delimiters;

selecting the particular candidate row delimiter as the row delimiter for the data input file
Claim 5 (US 10,204,119):
wherein analyzing the sample excerpt to determine a row delimiter for the data input file comprises: storing row delimiter whitelist data comprising a plurality of candidate row delimiter; 


searching the sample excerpt to locate a particular row delimiter candidate, wherein the particular candidate row delimiter is a first occurrence of any of the plurality of candidate row delimiters;

selecting the particular row delimiter candidate as the row delimiter for the data input file
Claim 5 (US 16/748,351):
wherein identifying the one or more candidate column delimiters comprises: identifying the one or more candidate column delimiters in the sample excerpt
Claim 6 (US 10,204,119):
wherein identifying the one or more candidate column delimiters comprises: storing column delimiter whitelist data comprising the one or more candidate column delimiters; identifying the one more candidate column delimiters in the sample excerpt
Claim 6 (US 16/748,351):
wherein identifying the one or more candidate column delimiters comprises:

determining that the sample excerpt does not contain any of the plurality of particular candidate column delimiters;

identifying, as the one or more candidate column delimiters, one or more symbols in the sample excerpt that are not contained in the column delimiter blacklist data
Claim 7 (US 10,204,119):
wherein identifying the one or more candidate column delimiters comprises: storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters;
storing column delimiter black list data comprising data identifying one or more symbols that are not candidate column delimiters;

determining that the sample excerpt does not contain any of the plurality of particular candidate column delimiters;

identifying, as the one or more candidate column delimiters, one or more symbols in the sample excerpt that are not contained in the column delimiter black list data
Claim 8 (US 16/748,351):
wherein analyzing the sample excerpt to determine a column delimiter for the data input file comprises: identifying, in the sample excerpt, one or more symbols following an open quotation and preceding a close quotation;

identifying a particular symbol immediately following the close quotation;

selecting the particular symbol immediately following the close quotation as the column delimiter
Claim 9 (US 10,204,119):
wherein analyzing the sample excerpt to determine a column delimiter for the data input file comprises: identifying, in the sample excerpt, one or more symbols following an open quotation and preceding a close quotation;

identifying a particular symbol immediately following the close quotation;

selecting the particular symbol as the column delimiter
Claim 20 (US 16/748,351):
further comprising: receiving, from a user device, a second row delimiter for the data input file; using the row delimiter, identifying a second plurality of rows from the sample excerpt that is not included in the header data for the data input file; identifying one or more second candidate column delimiters in the second plurality of rows;

using the second candidate column delimiter and the second row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns; storing, in a database, the plurality of rows that contain a plurality of columns
Claims 2-3, 10-11 (US 10,204,119):
[claim 2] “analyzing the sample excerpt to determine header data for the data input file, the header data comprising one or more strings in the data input file; wherein analyzing the sample excerpt to determine the column delimiter, the row delimiter, and the plurality of data format types comprises analyzing only data in the sample excerpt that is not included in the header data;” [claim 3] “wherein analyzing the sample excerpt to determine header data for the data input file comprises: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; based, at least in part, on determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of header data;” [Claim 10] “further comprising: displaying with the plurality of sample rows and sample columns, data identifying the plurality of data format types, the row delimiter, and the column delimiter” [Claim 11] “further comprising: receiving, through the graphical user interface, input modifying one or more of the column delimiter, the row delimiter, or one or more of the plurality of data format types (receiving, from a user device, a second row delimiter); in response to the input, performing: analyzing the sample excerpt to determine a second column delimiter for the data input file; analyzing the sample excerpt to determine a second row delimiter for the data input file;
analyzing the sample excerpt to determine a second plurality of data format types; using the second column delimiter, second row delimiter, and second plurality of data format types to generate a second candidate schema for the data input file; using the second candidate schema and the data input file, generating a second plurality of sample rows and sample columns; displaying the second plurality of sample rows and sample columns through the graphical user interface;”
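For illustration only, two of the detection steps compared above can be sketched in code: selecting the row delimiter as the first occurrence of any whitelisted candidate, and selecting the column delimiter as the symbol immediately following a close quotation. This is a minimal sketch, not the method of either document. The whitelist contents, the function names, the tie-break that prefers the longer candidate at the same position, and the skipping of line-break characters after a close quotation are all assumptions made for the example.

```python
import re

# Assumption: hypothetical whitelist of candidate row delimiters.
ROW_DELIMITER_WHITELIST = ["\r\n", "\n", "\r"]


def detect_row_delimiter(sample, whitelist=ROW_DELIMITER_WHITELIST):
    """Return the whitelisted candidate that occurs first in the sample,
    or None if no candidate occurs at all."""
    best_key, best = None, None
    for candidate in whitelist:
        position = sample.find(candidate)
        if position == -1:
            continue
        # Earliest position wins; at equal positions prefer the longer
        # candidate so "\r\n" is not mistaken for a bare "\r" (assumption).
        key = (position, -len(candidate))
        if best_key is None or key < best_key:
            best_key, best = key, candidate
    return best


def delimiter_after_quoted_field(sample):
    """Return the symbol immediately following the first quoted field,
    skipping line-break characters, or None if there is none."""
    for match in re.finditer(r'"[^"]+"(.)', sample, flags=re.DOTALL):
        symbol = match.group(1)
        if symbol not in "\r\n":
            return symbol
    return None
```

Under these assumptions, a sample such as `'"alice"|"30"\n"bob"|"41"\n'` yields `"\n"` as the row delimiter and `"|"` as the column delimiter.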


Regarding claim 1 of the present application, claim 1 of U.S. Patent No. 10,204,119 fails to disclose “…to be stored in a database, the data input file having unknown schema; storing, in a database, the plurality of rows that contain a plurality of columns”.
However, Elmore teaches the following limitations: “…to be stored in a database” at least by ([col. 9, lines 8-10] “File receiver 310 receives the file as described above, copies the first number of bytes of the file, and stores such copy in file storage 398” [col. 10, lines 50-52] “The delimiters corresponding to the set of rows selected by the user are provided to parser 354, which parses the file and stores the parsed file into file storage 398”),
“the data input file having unknown schema” at least by ([col. 4, lines 60-64] “FIG. 2, consisting of FIGS. 2A and 2B is a flowchart illustrating a method of identifying row delimiters, column delimiters, and string delimiters from a file in which the delimiters are unknown. The file is received and the first N bytes of the file are copied 210.” [col. 8, lines 62-67] “It is noted that there is no need for input about the file structure or origin used to attempt to identify delimiters according to the present invention”), where the delimiters (schema) and the file structure are unknown when the file is received;
“storing, in a database, the plurality of rows that contain a plurality of columns” at least by ([col. 9, lines 8-10] “File receiver 310 receives the file as described above, copies the first number of bytes of the file, and stores such copy in file storage 398” [col. 10, lines 50-52] “The delimiters corresponding to the set of rows selected by the user are provided to parser 354, which parses the file and stores the parsed file into file storage 398”), where the file storage (database) stores both the input file and the parsed input file.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Elmore into claim 1 of U.S. Patent No. 10,204,119 because the references similarly disclose identifying and utilizing delimiters within files. Consequently, one of ordinary skill in the art would have been motivated to modify the method of claim 1 of U.S. Patent No. 10,204,119 so that the file has an unknown schema and is stored in a database, as in Elmore, in order to retrieve the files in the future and to disambiguate files with an unknown schema.
Claims 3-6, 8, 20 depend on claim 1 and are, therefore, rejected for the same reasons as applied hereinabove.
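For illustration only, the mode-based selection recited in the compared claims (count the instances of each candidate per row, take the mode of those counts, sum the deviations from the mode, and select the candidate with the lowest total) can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names are hypothetical, the deviation is taken as an absolute difference, and every candidate is assumed to have already been identified in the rows.

```python
from collections import Counter


def total_deviation(rows, delimiter):
    """Sum, over all rows, of how far the per-row count of the delimiter
    is from the most frequent (mode) per-row count."""
    if not rows:
        raise ValueError("at least one row is required")
    counts = [row.count(delimiter) for row in rows]
    mode = Counter(counts).most_common(1)[0][0]
    # Assumption: deviation is measured as an absolute difference.
    return sum(abs(count - mode) for count in counts)


def select_column_delimiter(rows, candidates):
    """Return the candidate with the lowest total deviation.

    Assumption: each candidate occurs in the rows; a symbol that never
    occurs would trivially score zero and should be filtered out earlier.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda d: total_deviation(rows, d))
```

For the rows `["a,b;c,d", "e,f,g", "h,i;j,k"]`, the comma occurs twice in every row (total deviation 0) while the semicolon occurs once, zero times, and once (total deviation 1), so the comma is selected.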

Claims 10, 13-14, 17 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-3, 5-6, 8-9 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928). Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of the present application would have been obvious to one of ordinary skill in the art over claims 1-3, 5-6, 8-9 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928).
Claim comparison: each claim of US 16/748,351 is followed by the corresponding claim(s) of US 10,204,119.
Claim 10 (US 16/748,351):
A system comprising: …cause performance of:

receiving a data input file…;

selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file;

analyzing the sample excerpt to determine header data for the data input file, determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file;

analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file;

using the row delimiter, identifying a plurality of rows from the sample excerpt that is not included in the header data for the data input file;

storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters of the plurality of particular candidate column delimiters in the plurality of rows; 
identifying, a plurality of candidate column delimiters, symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data;

for each candidate column delimiter of the plurality of candidate column delimiters: identifying a number of instances of said each candidate column delimiter of the plurality of candidate column delimiters in each row of the plurality of rows, identifying a mode of said each candidate column delimiter as a most frequent number of instances of said each candidate column delimiter in the plurality of rows, and computing, based on the mode of said each candidate column delimiter, a total deviation value for said each candidate column delimiter of the plurality of candidate column delimiters;

comparing the total deviation values of the plurality of candidate column delimiters to determine that a particular candidate column delimiter comprises a lowest total deviation of the plurality of candidate column delimiters and, in response, selecting the particular candidate column delimiter;

using the particular candidate column delimiter and the row delimiter to generate a candidate schema for the data input file;


using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns

Claims 1-3, 8 (US 10,204,119):
A method comprising:


receiving a data input file;

selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file;

[claim 2] “analyzing the sample excerpt to determine header data for the data input file, the header data comprising one or more strings in the data input file; wherein analyzing the sample excerpt to determine the column delimiter, the row delimiter, and the plurality of data format types comprises analyzing only data in the sample excerpt that is not included in the header data;” [claim 3] “wherein analyzing the sample excerpt to determine header data for the data input file comprises: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; based, at least in part, on determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of header data;”

analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file;

analyzing the sample excerpt to determine a column delimiter for the data input file, the column delimiter comprising one or more symbols that delimit each particular column of a plurality of columns in the data input file;
wherein analyzing the sample excerpt to determine a column delimiter for the data input file comprises:

[claim 2] “using the row delimiter, identifying a plurality of rows;
identifying, in the plurality of rows, one or more candidate column delimiters; … wherein analyzing the sample excerpt to determine the column delimiter, the row delimiter, and the plurality of data format types comprises analyzing only data in the sample excerpt that is not included in the header data”

[claim 8] “wherein identifying the one or more candidate column delimiters comprises: storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters in the sample excerpt; determining that a total deviation for the one or more particular column delimiters exceeds a stored deviation threshold and, in response, identifying, as at least one of the one or more candidate column delimiters, one or more symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data”

for each candidate column delimiter of the one or more candidate column delimiters:
identifying a number of instances of the candidate column delimiter in each the plurality of rows; determining a mode of the numbers of instances of the candidate column delimiter in each of the plurality of rows; and computing a total deviation for the candidate column delimiter, the total deviation comprising a sum of deviations of the number of instances of the candidate column delimiter in each of the plurality of rows from the mode of the numbers of instances of the candidate column delimiter in each of the plurality of rows;

determining that a particular candidate column delimiter comprises a lowest total deviation of the candidate column delimiters and, in response, selecting the particular candidate column delimiter (in light of [col. 9, lines 32-38] of the patent’s specification which utilize deviations from a mode to identify the candidate column delimiter and further recite an example involving comparison of deviations of delimiters and making a selection based on the comparison; further, no alternative methods or steps are disclosed for determining the lowest total deviation);

analyzing the sample excerpt to determine a plurality of data format types, each particular data format type corresponding to a particular column of each particular column of the plurality of columns in the data input file;
wherein analyzing the sample excerpt to determine a plurality of data format types comprises:
using the row delimiter, identifying a plurality of rows;
using the column delimiter, identifying a plurality of columns;
for each column of the plurality of columns performing:
parsing data in the plurality of rows using a plurality of data formats;
determining that data in one or more rows of the plurality of rows cannot be parsed with one or more first data formats of the plurality of data formats; identifying one or more candidate data formats for the column excluding the one or more first data formats; and
selecting a second data format from the one or more candidate data formats;

using the column delimiter, row delimiter, and plurality of data format types to generate a candidate schema for the data input file;

using the candidate schema and the data input file, generating a plurality of sample rows and sample columns;

displaying the plurality of sample rows and sample columns through a graphical user interface; wherein the method is performed using one or more processors
Claim 13 (US 16/748,351):
wherein analyzing the sample excerpt to determine a row delimiter for the data input file comprises: storing row delimiter whitelist data comprising a plurality of candidate row delimiters;

searching the sample excerpt to locate a particular candidate row delimiter, wherein the particular candidate row delimiter is a first occurrence of any of the plurality of candidate row delimiters; 

selecting the particular candidate row delimiter as the row delimiter for the data input file
5
wherein analyzing the sample excerpt to determine a row delimiter for the data input file comprises: storing row delimiter whitelist data comprising a plurality of candidate row delimiter; 


searching the sample excerpt to locate a particular row delimiter candidate, wherein the particular candidate row delimiter is a first occurrence of any of the plurality of candidate row delimiters;

selecting the particular row delimiter candidate as the row delimiter for the data input file
14
wherein identifying the one or more candidate column delimiters comprises: identifying the one or more candidate column delimiters in the sample excerpt
6
wherein identifying the one or more candidate column delimiters comprises: storing column delimiter whitelist data comprising the one or more candidate column delimiters; identifying the one more candidate column delimiters in the sample excerpt
17
wherein analyzing the sample excerpt to determine a column delimiter for the data input file comprises: identifying, in the sample excerpt, one or more symbols following an open quotation and preceding a close quotation;

identifying a particular symbol immediately following the close quotation;

selecting the particular symbol immediately following the close quotation as the column delimiter
9
wherein analyzing the sample excerpt to determine a column delimiter for the data input file comprises: identifying, in the sample excerpt, one or more symbols following an open quotation and preceding a close quotation;

identifying a particular symbol immediately following the close quotation;

selecting the particular symbol as the column delimiter
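
For illustration only, the row delimiter and quoted-field steps compared above can be sketched in Python as follows. This is a minimal sketch and not a characterization of how either the application or the patent implements the steps. The whitelist contents, the function names, and the tie-breaking rule (preferring the longer delimiter when two candidates start at the same position) are assumptions that do not appear in the claims.

```python
from typing import Optional, Sequence

# Hypothetical whitelist; the claims do not enumerate specific row delimiters.
ROW_DELIMITER_WHITELIST = ("\r\n", "\n", "\r")


def find_row_delimiter(
    sample: str, whitelist: Sequence[str] = ROW_DELIMITER_WHITELIST
) -> Optional[str]:
    """Return the whitelisted candidate whose first occurrence is earliest."""
    best: Optional[str] = None
    best_pos = len(sample)
    for candidate in whitelist:
        pos = sample.find(candidate)
        if pos == -1:
            continue
        # Assumed tie-break: at the same position keep the longer candidate,
        # so "\r\n" is not mistaken for a bare "\r".
        if pos < best_pos or (pos == best_pos and len(candidate) > len(best)):
            best, best_pos = candidate, pos
    return best


def delimiter_after_quoted_field(sample: str, quote: str = '"') -> Optional[str]:
    """Return the symbol immediately following the first closed quotation."""
    open_pos = sample.find(quote)
    if open_pos == -1:
        return None
    close_pos = sample.find(quote, open_pos + 1)
    # Require one or more symbols between the open and close quotation.
    if close_pos <= open_pos + 1:
        return None
    if close_pos + 1 >= len(sample):
        return None
    return sample[close_pos + 1]
```

Under these assumptions, `find_row_delimiter("a,b\r\nc,d\r\n")` selects `"\r\n"`, and `delimiter_after_quoted_field('"Smith, J"|42')` selects `"|"` even though a comma appears inside the quoted field.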


Regarding claim 10 of the present application, claim 1 of U.S. Patent No. 10,204,119 fails to disclose “one or more processors; one or more storage media; one or more instructions stored in the one or more storage media which, when executed by the one or more processors; …to be stored in a database, the data input file having unknown schema; storing, in a database, the plurality of rows that contain a plurality of columns”.
However, Elmore teaches the following limitations, one or more processors; one or more storage media; one or more instructions stored in the one or more storage media which, when executed by the one or more processors at least by ([cols. 3-4, lines 65-5] “The present invention may be implemented as computer software on a conventional computer system. Referring now to FIG. 1, a conventional computer system 150 for practicing the present invention is shown. Processor 160 retrieves and executes software instructions stored in storage 162 such as memory, which may be Random Access Memory (RAM) and may control other components to perform the present invention.”);
…to be stored in a database at least by ([col. 9, lines 8-10] “File receiver 310 receives the file as described above, copies the first number of bytes of the file, and stores such copy in file storage 398” [col. 10, lines 50-52] “The delimiters corresponding to the set of rows selected by the user are provided to parser 354, which parses the file and stores the parsed file into file storage 398”),
the data input file having unknown schema at least by ([col. 4, lines 60-64] “FIG. 2, consisting of FIGS. 2A and 2B is a flowchart illustrating a method of identifying row delimiters, column delimiters, and string delimiters from a file in which the delimiters are unknown. The file is received and the first N bytes of the file are copied 210.” [col. 8, lines 62-67] “It is noted that there is no need for input about the file structure or origin used to attempt to identify delimiters according to the present invention”) and the delimiters (schema) as well as file structure are unknown when the file is received;
storing, in a database, the plurality of rows that contain a plurality of columns at least by ([col. 9, lines 8-10] “File receiver 310 receives the file as described above, copies the first number of bytes of the file, and stores such copy in file storage 398” [col. 10, lines 50-52] “The delimiters corresponding to the set of rows selected by the user are provided to parser 354, which parses the file and stores the parsed file into file storage 398”) and the file storage (database) stores the input file as well as the parsed input file.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Elmore into claim 1 of U.S. Patent No. 10,204,119 because the references similarly disclose identifying and utilizing delimiters within files. Consequently, one of ordinary skill in the art would be motivated to further modify the claimed invention as in claim 1 of U.S. Patent No. 10,204,119 to further include the file stored to a database as well as the file having an unknown schema as in Elmore in order to have the ability to retrieve the files in the future and disambiguate files with an unknown schema.
Claims 13-14, 17 depend on claim 10 and are, therefore, rejected for the same reasons as applied hereinabove.

Claim 12 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 4 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928). Although the claims at issue are not identical, they are not patentably distinct from each other because it would be obvious to one of ordinary skill in the art that this claim in the present application is unpatentable over claim 4 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928).
Claim 12 of US 16/748,351:
…further cause performance of, using the header data for the data input file, extracting one or more column names for the plurality of columns

Claim 4 of US 10,204,119:
further comprising using the header data, extracting one or more column names for the plurality of columns


Regarding claim 12 of the present application, claim 4 of U.S. Patent No. 10,204,119 fails to disclose “wherein the instructions, when executed by the one or more processors”.
However, Elmore teaches the above limitation at least by ([cols. 3-4, lines 65-5] “The present invention may be implemented as computer software on a conventional computer system. Referring now to FIG. 1, a conventional computer system 150 for practicing the present invention is shown. Processor 160 retrieves and executes software instructions stored in storage 162 such as memory, which may be Random Access Memory (RAM) and may control other components to perform the present invention.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Elmore into claim 4 of U.S. Patent No. 10,204,119 because the references similarly disclose identifying and utilizing delimiters within files. Consequently, one of ordinary skill in the art would be motivated to further modify the claimed invention as in claim 4 of U.S. Patent No. 10,204,119 to further include the necessary hardware as in Elmore in order to be able to perform the extraction automatically on a computer.

Claim 19 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-3, 8, 10-11 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928). Although the claims at issue are not identical, they are not patentably distinct from each other because it would be obvious to one of ordinary skill in the art that this claim in the present application is unpatentable over claims 1-3, 8, 10-11 of U.S. Patent No. 10,204,119 in view of Elmore (US 9,753,928).
Claim 19 of US 16/748,351:
A method comprising:


receiving a data input file…;

selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file;

analyzing the sample excerpt to determine header data for the data input file, determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file;

analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file;

using the row delimiter, identifying a plurality of rows from the sample excerpt that is not included in the header data for the data input file;

storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters of the plurality of particular candidate column delimiters in the plurality of rows;
identifying, as a plurality of candidate column delimiters, symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data;

receiving, from a user device, a second row delimiter for the data input file; using the second row delimiter, identifying a second plurality of rows from the sample excerpt that is not included in the header data for the data input file; identifying one or more second candidate column delimiters in the second plurality of rows;

using the second candidate column delimiter and the second row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns;

wherein the method is performed using one or more processors

Claims 1-3, 8, 10-11 of US 10,204,119:
A method comprising:


receiving a data input file;

selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file;

[claim 2] “analyzing the sample excerpt to determine header data for the data input file, the header data comprising one or more strings in the data input file; wherein analyzing the sample excerpt to determine the column delimiter, the row delimiter, and the plurality of data format types comprises analyzing only data in the sample excerpt that is not included in the header data;” [claim 3] “wherein analyzing the sample excerpt to determine header data for the data input file comprises: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; based, at least in part, on determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of header data;”

analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file;

analyzing the sample excerpt to determine a column delimiter for the data input file, the column delimiter comprising one or more symbols that delimit each particular column of a plurality of columns in the data input file;
wherein analyzing the sample excerpt to determine a column delimiter for the data input file comprises:

using the row delimiter, identifying a plurality of rows; using the column delimiter, identifying a plurality of columns (and claims 2-3 for “not included in header data…”);

[claim 8] “wherein identifying the one or more candidate column delimiters comprises: storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters in the sample excerpt; determining that a total deviation for the one or more particular column delimiters exceeds a stored deviation threshold and, in response, identifying, as at least one of the one or more candidate column delimiters, one or more symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data”

for each candidate column delimiter of the one or more candidate column delimiters:
identifying a number of instances of the candidate column delimiter in each the plurality of rows; determining a mode of the numbers of instances of the candidate column delimiter in each of the plurality of rows; and computing a total deviation for the candidate column delimiter, the total deviation comprising a sum of deviations of the number of instances of the candidate column delimiter in each of the plurality of rows from the mode of the numbers of instances of the candidate column delimiter in each of the plurality of rows; determining that a particular candidate column delimiter comprises a lowest total deviation of the candidate column delimiters and, in response, selecting the particular candidate column delimiter (in light of [col. 9, lines 32-38] of the patent’s specification which utilize deviations from a mode to identify the candidate column delimiter and further recite an example involving comparison of deviations of delimiters and making a selection based on the comparison; further, no alternative methods or steps are disclosed for determining the lowest total deviation); analyzing the sample excerpt to determine a plurality of data format types, each particular data format type corresponding to a particular column of each particular column of the plurality of columns in the data input file; wherein analyzing the sample excerpt to determine a plurality of data format types comprises:


[claim 2] “analyzing the sample excerpt to determine header data for the data input file, the header data comprising one or more strings in the data input file; wherein analyzing the sample excerpt to determine the column delimiter, the row delimiter, and the plurality of data format types comprises analyzing only data in the sample excerpt that is not included in the header data;” [claim 3] “wherein analyzing the sample excerpt to determine header data for the data input file comprises: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; based, at least in part, on determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of header data;” [Claim 10] “further comprising: displaying with the plurality of sample rows and sample columns, data identifying the plurality of data format types, the row delimiter, and the column delimiter” [Claim 11] “further comprising: receiving, through the graphical user interface, input modifying one or more of the column delimiter, the row delimiter, or one or more of the plurality of data format types (receiving, from a user device, a second row delimiter); in response to the input, performing: analyzing the sample excerpt to determine a second column delimiter for the data input file; analyzing the sample excerpt to determine a second row delimiter for the data input file;
analyzing the sample excerpt to determine a second plurality of data format types; using the second column delimiter, second row delimiter, and second plurality of data format types to generate a second candidate schema for the data input file; using the second candidate schema and the data input file, generating a second plurality of sample rows and sample columns; displaying the second plurality of sample rows and sample columns through the graphical user interface;”

for each column of the plurality of columns performing: parsing data in the plurality of rows using a plurality of data formats; determining that data in one or more rows of the plurality of rows cannot be parsed with one or more first data formats of the plurality of data formats; identifying one or more candidate data formats for the column excluding the one or more first data formats; and
selecting a second data format from the one or more candidate data formats; using the column delimiter, row delimiter, and plurality of data format types to generate a candidate schema for the data input file; using the candidate schema and the data input file, generating a plurality of sample rows and sample columns;
displaying the plurality of sample rows and sample columns through a graphical user interface; wherein the method is performed using one or more processors
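
For illustration only, the mode and total deviation steps quoted above from claim 1 of U.S. Patent No. 10,204,119 can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not a description of the patented implementation: the function names are hypothetical, the rows are assumed to be non-empty and already split using the row delimiter, each deviation is taken as an absolute difference, and ties (in the mode or in the lowest total deviation) are resolved by first occurrence.

```python
from collections import Counter
from typing import Sequence


def total_deviation(rows: Sequence[str], candidate: str) -> int:
    """Sum of per-row deviations of the candidate's count from the mode."""
    # Number of instances of the candidate in each row.
    counts = [row.count(candidate) for row in rows]
    # Mode: the most frequent number of instances across the rows.
    mode = Counter(counts).most_common(1)[0][0]
    # Assumption: each deviation is an absolute difference from the mode.
    return sum(abs(count - mode) for count in counts)


def select_column_delimiter(rows: Sequence[str], candidates: Sequence[str]) -> str:
    """Select the candidate having the lowest total deviation."""
    deviations = {c: total_deviation(rows, c) for c in candidates}
    # min() keeps the first candidate on a tie (an assumed tie-break).
    return min(candidates, key=lambda c: deviations[c])
```

For example, with rows `["a,b,c", "d,e,f", "g;h,i,j"]` and candidates `","` and `";"`, the comma appears twice in every row (total deviation 0), while the semicolon appears in only one row (total deviation 1), so the comma is selected. A candidate that never appears in any row would also have a total deviation of 0, so the sketch assumes the candidates were already identified in the rows.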


Regarding claim 19 of the present application, claim 1 of U.S. Patent No. 10,204,119 fails to disclose “…to be stored in a database, the data input file having unknown schema; storing, in a database, the plurality of rows that contain a plurality of columns”.
However, Elmore teaches the following limitations, …to be stored in a database at least by ([col. 9, lines 8-10] “File receiver 310 receives the file as described above, copies the first number of bytes of the file, and stores such copy in file storage 398” [col. 10, lines 50-52] “The delimiters corresponding to the set of rows selected by the user are provided to parser 354, which parses the file and stores the parsed file into file storage 398”),
the data input file having unknown schema at least by ([col. 4, lines 60-64] “FIG. 2, consisting of FIGS. 2A and 2B is a flowchart illustrating a method of identifying row delimiters, column delimiters, and string delimiters from a file in which the delimiters are unknown. The file is received and the first N bytes of the file are copied 210.” [col. 8, lines 62-67] “It is noted that there is no need for input about the file structure or origin used to attempt to identify delimiters according to the present invention”) and the delimiters (schema) as well as file structure are unknown when the file is received;
storing, in a database, the plurality of rows that contain a plurality of columns at least by ([col. 9, lines 8-10] “File receiver 310 receives the file as described above, copies the first number of bytes of the file, and stores such copy in file storage 398” [col. 10, lines 50-52] “The delimiters corresponding to the set of rows selected by the user are provided to parser 354, which parses the file and stores the parsed file into file storage 398”) and the file storage (database) stores the input file as well as the parsed input file.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Elmore into claim 1 of U.S. Patent No. 10,204,119 because the references similarly disclose identifying and utilizing delimiters within files. Consequently, one of ordinary skill in the art would be motivated to further modify the claimed invention as in claim 1 of U.S. Patent No. 10,204,119 to further include the file stored to a database as well as the file having an unknown schema as in Elmore in order to have the ability to retrieve the files in the future and disambiguate files with an unknown schema.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1, 3-6, 8, 10, 12-14, 17, 19-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claims 1, 10, 19, 20 recite "a database" in lines 46, 50, 37, 11. This portion of the limitation is unclear because, prior to this portion of the limitation, claims 1, 10, 19 also recite “a database” initially in the “receiving” limitation while claim 20 recites “a database” in the “storing” limitation. Therefore, it is not clear if the latter recitations of “a database” refer to the same database as initially recited or perhaps another database. For this reason, these claims fail to particularly point out and distinctly define the metes and bounds of the subject matter to be protected by the patent grant (MPEP 2171). In an effort to practice compact prosecution, the examiner is interpreting each of these recitations of a database as referring to the same database. Claims 3-6, 8, 12-14, 17 are also rejected for the same reason due to their dependency on claims 1, 10;
Claims 1, 10, 19, 20 recite "the plurality of rows" in lines 25/32/34/48, 29/36/38/50, 25/37, 11, respectively. This portion of the limitation is unclear because, prior to this portion of the limitation, claims 1, 10, 19 recite “a plurality of rows in the data input file”, “a plurality of rows from the sample excerpt”, or “a plurality of rows and columns”, while claim 20 recites “a second plurality of rows” and “a plurality of rows and columns”, as well as the recitations in claim 1, upon which this claim depends. Therefore, it is not clear if “the plurality of rows” refers to “a plurality of rows in the data input file”, “a plurality of rows from the sample excerpt”, “a plurality of rows and columns”, “a second plurality of rows”, or perhaps other rows and columns. For this reason, these claims fail to particularly point out and distinctly define the metes and bounds of the subject matter to be protected by the patent grant (MPEP 2171). In an effort to practice compact prosecution, the examiner is interpreting the first three recitations of the plurality of rows in claims 1, 10, 19 as referring to a plurality of rows from the sample excerpt and the last recitation of the plurality of rows as referring to a plurality of rows and columns. Further, the examiner is interpreting the plurality of rows as referring to a plurality of rows and columns as recited in claim 20. Lastly, the examiner is interpreting “a plurality of rows in the data input file”, “a plurality of rows from the sample excerpt”, “a plurality of rows and columns”, and “a second plurality of rows” as each referring to a separate set of rows. Claims 3-6, 8, 12-14, 17 are also rejected for the same reason due to their dependency on claims 1, 10;
Claims 1, 10, 19 recite "in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value" in lines 12, 16, 12, respectively. This portion of the limitation is unclear because, prior to this portion of the limitation, the claims recite the limitation “determining that a first row…”, which is not referred back to by the “in response to determining that the first row does not contain a delimited value” limitation. Therefore, it is not clear if the “in response to determining” limitation refers to the “determining that a first row…” limitation, or perhaps, is merely reciting an additional or separate determining step. Additionally, the “in response to determining” limitation recites “does not contain a delimited value”; however, the “determining that a first row…” limitation states “does not contain a delimited numeric value”. Therefore, it is not clear if the “delimited value” refers to the “delimited numeric value” as initially recited, or perhaps, a different or another delimited value. For these reasons, these claims fail to particularly point out and distinctly define the metes and bounds of the subject matter to be protected by the patent grant (MPEP 2171). In an effort to practice compact prosecution, the examiner is interpreting the “in response to determining that the first row…” limitation as referring back to the “determining that a first row…” limitation and “the delimited value” as referring back to the “delimited numeric value”, respectively. Claims 3-6, 8, 12-14, 17, 20 are also rejected for the same reason due to their dependency on claims 1, 10;
Claim 20 recites “the candidate schema” in line 9. This portion of the limitation is unclear because, prior to this portion of the limitation, claim 1 also recites “a candidate schema” initially in the “using the particular candidate column” limitation. Therefore, it is not clear if “the candidate schema” refers to “a candidate schema” as initially recited in claim 1 which was generated using the particular candidate column delimiter and the row delimiter or if it refers to “a candidate schema” as recited in claim 20 which was generated using the second candidate column delimiter and the second row delimiter. For this reason, this claim fails to particularly point out and distinctly define the metes and bounds of the subject matter to be protected by the patent grant (MPEP 2171). In an effort to practice compact prosecution, the examiner is interpreting the recitation of a/the candidate schema in claim 1 as referring to a candidate schema as recited in claim 1 which was generated using the particular candidate column delimiter and the row delimiter and a/the candidate schema in claim 20 as referring to a candidate schema which was generated using the second candidate column delimiter and the second row delimiter.
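
For illustration only, the header determination as interpreted above for claims 1, 10, 19 (with “delimited value” read as referring back to “delimited numeric value”) can be sketched in Python as follows. This is a minimal sketch and not the applicant's implementation. The function names and the regular expression are hypothetical; in particular, the sketch assumes that a “delimited numeric value” is a number not directly adjacent to a letter, digit, underscore, or period, which is a definition the claims do not supply.

```python
import re
from typing import Sequence

# Assumed definition of a "delimited numeric value": an optionally signed
# integer or decimal that is not directly attached to alphanumeric text.
_NUMERIC_TOKEN = re.compile(
    r"(?<![A-Za-z0-9_.])[+-]?\d+(?:\.\d+)?(?![A-Za-z0-9_.])"
)


def has_delimited_numeric_value(row: str) -> bool:
    """True if the row contains at least one standalone numeric value."""
    return _NUMERIC_TOKEN.search(row) is not None


def first_row_is_header(rows: Sequence[str]) -> bool:
    """Apply the same numeric test to the first and second rows.

    The first row is treated as header data only when it contains no
    delimited numeric value and the second row contains one.
    """
    if len(rows) < 2:
        return False
    return (not has_delimited_numeric_value(rows[0])) and has_delimited_numeric_value(rows[1])
```

Under these assumptions, `first_row_is_header(["name,age", "Alice,30"])` is true, while `first_row_is_header(["1,2", "3,4"])` and `first_row_is_header(["name,age", "Bob,Carol"])` are false. Using one predicate for both rows reflects the interpretation that both determinations concern the same kind of value.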

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 3-6, 8, 10, 12-14, 17, 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 10 similarly recite receiving a data input file to be stored in a database, the data input file having unknown schema; selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file; analyzing the sample excerpt to determine header data for the data input file, determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file; analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file; using the row delimiter, identifying a plurality of rows from the sample excerpt that is not included in the header data for the data input file; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters of the plurality of particular candidate column delimiters in the plurality of rows; identifying, as a plurality of candidate column delimiters, one or more symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data; for each candidate column delimiter of the plurality of candidate column delimiters: identifying a number of instances of said each candidate column delimiter of the plurality of candidate column delimiters in each row of the plurality of rows, identifying a mode of said each candidate column delimiter as a most frequent number of instances of said each candidate column delimiter in the plurality of rows, and computing, based on the mode of said each candidate column delimiter, a total deviation value for said each candidate column delimiter of the plurality of candidate column delimiters; comparing the total deviation values of the plurality of candidate column delimiters to determine that a particular candidate column delimiter comprises a lowest total deviation of the plurality of candidate column delimiters and, in response, selecting the particular candidate column delimiter; using the particular candidate column delimiter and the row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns; storing, in a database, the plurality of rows that contain a plurality of columns; wherein the method is performed using one or more processors.
The limitations of selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file; analyzing the sample excerpt to determine header data for the data input file, determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file; analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file; using the row delimiter, identifying a plurality of rows from the sample excerpt that is not included in the header data for the data input file; identifying one or more particular candidate column delimiters of the plurality of particular candidate column delimiters in the plurality of rows; identifying, as a plurality of candidate column delimiters, one or more symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data; for each candidate column delimiter of the plurality of candidate column delimiters: identifying a number of instances of said each candidate column delimiter of the plurality of candidate column delimiters in each row of the plurality of rows; comparing the total deviation values of the plurality of candidate column delimiters to determine that a particular candidate column delimiter comprises a lowest total deviation of the plurality of candidate column delimiters and, in response, selecting the particular candidate column delimiter; using the particular candidate column delimiter and the row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns, as drafted, are processes that, under their broadest reasonable interpretation, cover mental processes but for the recitation of implementing them on generic computer components. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, the “selecting” limitation encompasses the user observing, analyzing, and/or judging a subset of the sample excerpt. The limitations “analyzing the sample excerpt”, “determining the header data”, and “in response to determining” encompass the user analyzing and judging the sample excerpt in order to determine whether rows contain delimited values and judging that the row without a delimited value contains header data. The limitations “analyzing the sample”, “identifying a plurality of rows”, and “identifying one or more particular candidate column delimiters” encompass the user observing, analyzing, and/or judging the sample excerpt to determine row delimiters and using them to identify rows and candidate column delimiters for the rows. The limitation “identifying, as a plurality of candidate column delimiters, symbols in the sample excerpt” encompasses the user analyzing and judging to determine if observed symbols are in a whitelist or blacklist. The limitations “for each candidate column delimiter of the plurality of candidate column delimiters: identifying a number of instances of said each candidate column delimiter”, “comparing the total deviation values”, and “selecting the particular candidate column delimiter” encompass the user observing and counting delimiters and comparing total deviation values, which is a simple comparison of numerical values, and selecting the associated column delimiter based on this determination.
The limitations, “using the particular candidate column delimiter”, “using the candidate schema for the data input file” encompass the user observing, analyzing, and/or judging, perhaps with the aid of a pen and piece of paper, the sample excerpt to determine a layout of the data based on the row and column delimiters and applying the schema to the file data to translate the file data into rows and columns, which can also be written down on the piece of paper. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. The limitations of, identifying a mode of said each candidate column delimiter as a most frequent number of instances of said each candidate column delimiter in the plurality of rows, and computing, based on the mode of said each candidate column delimiter, a total deviation value for said each candidate column delimiter of the plurality of candidate column delimiters, as drafted, are processes that, under their broadest reasonable interpretation, cover mathematical concepts but for the recitation of implementing them on generic computer components. That is, more specifically, these limitations recite pure mathematical calculations. Accordingly, claims 1, 10 recite multiple abstract ideas (Step 2A, Prong 1).
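For illustration only, the mode and total deviation limitations discussed above can be sketched as follows. The claim language quoted in this action does not define how the total deviation value is computed; the sketch assumes it is the sum, over all rows, of the absolute difference between each row's delimiter count and the mode. The function names and sample rows are hypothetical and are not taken from the application.

```python
from collections import Counter

def total_deviation(rows, delimiter):
    """Return the sum of |count - mode| over all rows.

    Assumption: the deviation formula is not given in the claims;
    absolute difference from the mode is used here for illustration.
    """
    # Number of instances of the candidate delimiter in each row
    counts = [row.count(delimiter) for row in rows]
    # Mode: the most frequent per-row count
    mode = Counter(counts).most_common(1)[0][0]
    return sum(abs(count - mode) for count in counts)

def select_delimiter(rows, candidates):
    """Pick the candidate with the lowest total deviation.

    Assumption: every candidate occurs somewhere in the rows, as it would
    when candidates are drawn from symbols found in the sample excerpt.
    """
    return min(candidates, key=lambda d: total_deviation(rows, d))

# Hypothetical sample rows: commas are regular, semicolons are not
rows = ["a,b,c;d", "e,f,g", "h,i,j", "k,l,m;n;o"]
best = select_delimiter(rows, [";", ","])
```

In this sample the comma appears twice in every row, so its total deviation is zero, while the semicolon counts vary from row to row and yield a nonzero total deviation.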
This judicial exception is not integrated into a practical application. In particular, the claims recite the additional elements of – receiving a data input file to be stored in a database, the data input file having unknown schema; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; storing, in a database, the plurality of rows that contain a plurality of columns; wherein the method is performed using one or more processors; a system comprising: one or more processors; one or more storage media; one or more instructions stored in the one or more storage media which, when executed by the one or more processors, cause performance of the recited steps. The one or more processors, one or more storage media, and database are recited at a high level of generality (i.e., as generic computer devices performing generic computer functions) and do not meaningfully limit the claim. The additional elements of receiving a data input file to be stored in a database, the data input file having unknown schema; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; storing, in a database, the plurality of rows that contain a plurality of columns represent insignificant extra-solution activities to the judicial exception and are mere data gathering steps. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2).
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of receiving a data input file; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; storing, in a database, the plurality of rows that contain a plurality of columns represent insignificant extra-solution activities that are well-understood, routine, and conventional activities previously known to the industry. That is, these limitations represent well-understood, routine, conventional activities in the fields of data processing and/or data storage and retrieval and are merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, these additional elements do not cause the claim to amount to significantly more than the judicial exception. (Step 2B). Accordingly, claims 1, 10 are not patent eligible.
Independent claim 19 recites a method comprising: receiving a data input file to be stored in a database, the data input file having unknown schema; selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file; analyzing the sample excerpt to determine header data for the data input file, determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file; analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file; using the row delimiter, identifying a plurality of rows from the sample excerpt that is not included in the header data for the data input file; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters of the plurality of particular candidate column delimiters in the plurality of rows; identifying, as a plurality of candidate column delimiters, symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data; receiving, from a user device, a second row delimiter for the data input file; using the second row delimiter, identifying a second plurality of rows from the sample excerpt that is not included in the header data for the data 
input file; identifying one or more second candidate column delimiters in the second plurality of rows; using the second candidate column delimiter and the second row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns; storing, in a database, the plurality of rows that contain a plurality of columns ; wherein the method is performed using one or more processors.
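For illustration only, the header determination recited above can be sketched as follows. The claim language does not define a “delimited numeric value”; the sketch assumes it means an integer or decimal token bounded by one of a small set of separator symbols, and it applies the same numeric test to the second row. The separator set, the numeric pattern, and the function names are hypothetical and are not taken from the application.

```python
import re

# Hypothetical separator set used only to tokenize a row for this sketch
ASSUMED_SEPARATORS = r"[,;|\t]"
# Assumption: "numeric" means an optionally signed integer or decimal
NUMERIC = re.compile(r"[+-]?\d+(\.\d+)?")

def has_delimited_numeric(row):
    """Return True if any separator-bounded token in the row is numeric."""
    for token in re.split(ASSUMED_SEPARATORS, row):
        if NUMERIC.fullmatch(token.strip()):
            return True
    return False

def first_row_is_header(sample_rows):
    """Treat the first row as header data when it has no delimited numeric
    value and the row that follows it does."""
    if len(sample_rows) < 2:
        return False
    return (not has_delimited_numeric(sample_rows[0])
            and has_delimited_numeric(sample_rows[1]))

# Hypothetical sample excerpt
sample = ["id,name,score", "1,Ann,9.5", "2,Bob,7.0"]
header_found = first_row_is_header(sample)
```

With this sample the first row contains only text tokens and the second row contains numeric tokens, so the first row is treated as header data.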
The limitations of, selecting a sample excerpt from the data input file, the sample excerpt comprising a subset of the data input file; analyzing the sample excerpt to determine header data for the data input file, determining the header data for the data input file comprising: determining that a first row in the sample excerpt does not contain a delimited numeric value; determining that a second row in the sample excerpt following the first row does contain a delimited value; in response to determining that the first row does not contain a delimited value and the second row does contain a delimited value, determining that the first row consists of the header data for the data input file; analyzing the sample excerpt to determine a row delimiter for the data input file, the row delimiter comprising one or more symbols that delimit each particular row of a plurality of rows in the data input file; using the row delimiter, identifying a plurality of rows from the sample excerpt that is not included in the header data for the data input file; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; identifying one or more particular candidate column delimiters of the plurality of particular candidate column delimiters in the plurality of rows; identifying, as a plurality of candidate column delimiters, symbols in the sample excerpt that are not contained in either the column delimiter whitelist data or the column delimiter blacklist data; receiving, from a user device, a second row delimiter for the data input file; using the second row delimiter, identifying a second plurality of rows from the sample excerpt that is not included in the header data for the data input file; identifying one or more second candidate column delimiters in the second plurality of rows; using the second candidate 
column delimiter and the second row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns, as drafted, are processes that, under their broadest reasonable interpretation, cover mental processes but for the recitation of implementing them on generic computer components. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, the “selecting” limitation encompasses the user observing, analyzing, and/or judging a subset of the sample excerpt. The limitations, “analyzing the sample excerpt”, “determining the header data”, “in response to determining” encompass the user analyzing and judging the sample excerpt in order to determine whether rows contain delimited values and judging that the row without a delimited value contains header data. The limitations, “analyzing the sample”, “identifying a plurality of rows”, “identifying one or more particular candidate column delimiters” encompass the user observing, analyzing, and/or judging the sample excerpt to determine row delimiters and using them to identify rows and candidate column delimiters for the rows. The limitation “identifying, as a plurality of candidate column delimiters, symbols in the sample excerpt” encompasses the user analyzing and judging to determine if observed symbols are in a whitelist or blacklist. 
The limitations, “identifying one or more second candidate column delimiters in the second plurality of rows”, “using the second candidate column delimiter and the second row delimiter to generate a candidate schema for the data input file”, “using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns” encompass the user observing, analyzing, and/or judging, perhaps with the aid of a pen and piece of paper, the sample excerpt to determine a layout of the data based on the second row and column delimiters and applying the schema to the file data to translate the file data into rows and columns, which can further be written down on the piece of paper. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, claim 19 recites multiple abstract ideas (Step 2A, Prong 1).
This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of – receiving a data input file to be stored in a database, the data input file having unknown schema; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; storing, in a database, the plurality of rows that contain a plurality of columns; wherein the method is performed using one or more processors. The one or more processors and database are recited at a high level of generality (i.e., as generic computer devices performing generic computer functions) and do not meaningfully limit the claim. The additional elements of receiving a data input file to be stored in a database, the data input file having unknown schema; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; storing, in a database, the plurality of rows that contain a plurality of columns represent insignificant extra-solution activities to the judicial exception and are mere data gathering steps. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2).
This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of receiving a data input file to be stored in a database, the data input file having unknown schema; storing column delimiter whitelist data comprising a plurality of particular candidate column delimiters; storing column delimiter blacklist data comprising data identifying one or more symbols that are not candidate column delimiters; storing, in a database, the plurality of rows that contain a plurality of columns represent insignificant extra-solution activities that are well-understood, routine, and conventional activities previously known to the industry. That is, these limitations represent well-understood, routine, conventional activities in the fields of data processing and/or data storage and retrieval and are merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, these additional elements do not cause the claim to amount to significantly more than the judicial exception. (Step 2B). Accordingly, claim 19 is not patent eligible.
Claims 3-6, 8, 12-14, 17, 20 depend on claims 1, 10 and include all the limitations of these claims. Therefore, claims 3-6, 8, 12-14, 17, 20 are directed to the same abstract idea and the analysis must proceed to (Step 2A, Prong 2).
Claims 3, 12 similarly recite the additional limitations pertaining to extracting column names using header data. This judicial exception is not integrated into a practical application. The additional elements represent further mental process steps of observing header data and judging column names based on analysis of the data. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. This additional step is considered an abstract idea (mental process step) and does not integrate the judicial exception into a practical application. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements represent further mental process steps. Therefore, these additional limitations are not sufficient to amount to significantly more than the judicial exception. Claims 3, 12 are not patent eligible.
Claims 4, 13 similarly recite the additional limitations pertaining to wherein analyzing the sample excerpt to determine a row delimiter for the data input file comprises: searching the sample excerpt, and selecting a candidate row delimiter. This judicial exception is not integrated into a practical application. These additional elements represent further mental process steps of observing a sample excerpt and judging a row delimiter. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. These additional steps are considered an abstract idea (mental process step) and do not integrate the judicial exception into a practical application. The additional limitation pertaining to storing row delimiter whitelist data does not integrate the abstract idea into a practical application and merely represents an insignificant extra-solution activity to the judicial exception and is a mere data gathering step. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional limitations pertaining to wherein analyzing the sample excerpt to determine a row delimiter for the data input file comprises: searching the sample excerpt, and selecting a candidate row delimiter represent further mental process steps. As discussed above with respect to integration of the abstract idea into a practical application, the additional element pertaining to storing row delimiter whitelist data represents well-understood, routine, conventional activity previously known to the industry. That is, this limitation represents well-understood, routine, conventional activity in the fields of data processing and/or data storage and retrieval and is merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, these additional elements do not cause the claim to amount to significantly more than the judicial exception. Claims 4, 13 are not patent eligible.
Claims 5-6, 14 recite additional limitations pertaining to identifying candidate column delimiters. This judicial exception is not integrated into a practical application. The additional elements represent further mental process steps of judging column delimiters, judging that the sample excerpt does not contain the candidate column delimiters, and further judging symbols as candidate column delimiters that are not found within blacklisted data. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. This additional step is considered an abstract idea (mental process step) and does not integrate the judicial exception into a practical application. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements represent further mental process steps. Therefore, these additional limitations are not sufficient to amount to significantly more than the judicial exception. Claims 5-6, 14 are not patent eligible.
Claims 8, 17 recite additional limitations pertaining to analyzing the sample excerpt to determine candidate column delimiters. This judicial exception is not integrated into a practical application. The additional elements represent further mental process steps of analyzing the sample excerpt to find an open and closed quotation and judging the symbol following the closed quotation as the column delimiter. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. This additional step is considered an abstract idea (mental process step) and does not integrate the judicial exception into a practical application. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements represent further mental process steps. Therefore, these additional limitations are not sufficient to amount to significantly more than the judicial exception. Claims 8, 17 are not patent eligible.
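For illustration only, the quotation-based determination discussed for claims 8, 17 can be sketched as follows. The sketch assumes double quotation marks, assumes the delimiter is a single non-alphanumeric, non-whitespace symbol immediately following a closing quotation mark, and picks the most frequent such symbol. The function name and sample text are hypothetical and are not taken from the application.

```python
import re
from collections import Counter

def delimiter_after_quotes(sample):
    """Return the symbol that most often follows a closed quotation.

    Assumption: quoted fields use double quotes and do not span lines.
    Returns None when no quoted field is followed by a symbol.
    """
    # Capture the single symbol right after each open/closed quotation pair
    followers = re.findall(r'"[^"\n]*"([^\w\s"])', sample)
    if not followers:
        return None
    return Counter(followers).most_common(1)[0][0]

# Hypothetical sample excerpt with quoted fields
sample = '"a","b","c"\n"d","e","f"\n'
found = delimiter_after_quotes(sample)
```

A quoted field at the end of a row is followed by the row delimiter rather than a column delimiter, so the pattern deliberately skips whitespace after the closing quotation mark.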
Claim 20 recites additional limitations of, receiving, from a user device, a second row delimiter for the data input file; using the row delimiter, identifying a second plurality of rows from the sample excerpt that is not included in the header data for the data input file; identifying one or more second candidate column delimiters in the second plurality of rows; using the second candidate column delimiter and the second row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns; storing, in a database, the plurality of rows that contain a plurality of columns. The limitations, using the row delimiter, identifying a second plurality of rows from the sample excerpt that is not included in the header data for the data input file; identifying one or more second candidate column delimiters in the second plurality of rows; using the second candidate column delimiter and the second row delimiter to generate a candidate schema for the data input file; using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns, as drafted, are processes that, under their broadest reasonable interpretation, cover mental processes but for the recitation of implementing them on generic computer components. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, the limitations, “identifying a second plurality of rows”, “identifying one or more second candidate column delimiters” encompass the user observing, analyzing, and/or judging the sample excerpt to determine second row delimiters and using them to identify rows and second candidate column delimiters for the rows. 
The limitations, “identifying one or more second candidate column delimiters in the second plurality of rows”, “using the second candidate column delimiter and the second row delimiter to generate a candidate schema for the data input file”, “using the candidate schema for the data input file, translating the data input file into a plurality of rows and columns” encompass the user observing, analyzing, and/or judging, perhaps with the aid of a pen and piece of paper, the sample excerpt to determine a layout of the data based on the second row and column delimiters and applying the schema to the file data to translate the file data into rows and columns, which can further be written down on the piece of paper. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, claim 20 further recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of – receiving, from a user device, a second row delimiter for the data input file; storing, in a database, the plurality of rows that contain a plurality of columns. These additional elements represent insignificant extra-solution activities to the judicial exception and are mere data gathering steps. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements represent insignificant extra-solution activities that are well-understood, routine, and conventional activities previously known to the industry. That is, these limitations represent well-understood, routine, conventional activities in the fields of data processing and/or data storage and retrieval and are merely directed to the well-understood, routine, conventional activity of storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). Therefore, these additional elements do not cause the claim to amount to significantly more than the judicial exception. Accordingly, claim 20 is not patent eligible.

Response to Arguments
The following is in response to the amendment filed on 10/04/22.
Applicant’s arguments have been carefully and respectfully considered but are not persuasive.
Regarding 35 USC 101, on pg. 11, applicant argues that claim 2 was not analyzed “as a whole” and is integrated into a practical application.
In response to the preceding argument, examiner respectfully submits that, as an initial matter, claim 2 is currently cancelled and the examiner has updated the current 101 rejection to include the amended limitations in claim 1 which are similar (and not identical) to those of claim 2, as previously presented. Because claim 2 is currently cancelled, applicant’s arguments regarding this claim are moot.
Regarding 35 USC 101, on pg. 11, applicant argues that MPEP 2106.05 expects a technical field, and the office action acknowledges that claim 1 is in the fields of data processing and/or data storage.
In response to the preceding argument, examiner respectfully submits that MPEP 2106.05 identifies that “limitations that the courts have found to qualify as "significantly more" when recited in a claim with a judicial exception include: Improvements to any other technology or technical field”. That is, the applicant’s representative appears to be misinterpreting this portion of the MPEP, which does not state that a technical field recited in a claim qualifies as significantly more, as is suggested. Further, the applicant’s representative also misrepresents the examiner’s office action, which does not recite that the claims provide significantly more, such as by providing an improvement to any other technology or technical field, as is suggested.
Regarding 35 USC 101, on pg. 12, applicant suggests that the claims provide a technical solution to a technical problem, and proceeds to compare the alleged problem and solution to what the applicant’s representative calls the “state of the art” comprising references cited by the examiner in previous office actions. The applicant’s representative further argues that the claims provide improvements and integrate the exception into a practical application.
In response to the preceding argument, examiner respectfully submits that these claims recite multiple abstract ideas (mental processes and mathematical concepts) and insignificant extra-solution activities that are well-understood, routine, or conventional. Therefore, when analyzed individually or as a whole, the claims would not provide any technical solutions to a technical problem, provide any improvements, or integrate the exception into a practical application. That is, the applicant’s representative appears to suggest that the mathematical calculations in the claim result in an increased accuracy in selecting delimiters; however, these limitations merely recite an abstract idea (mathematical calculations), and therefore, would not provide any improvements. Further, the prior art, as mentioned by the applicant, is not considered with regard to subject matter eligibility. That is, MPEP 2106.06 states that "Because they are separate and distinct requirements from eligibility, patentability of the claimed invention under 35 U.S.C. 102 and 103 with respect to the prior art is neither required for, nor a guarantee of, patent eligibility under 35 U.S.C. 101. The distinction between eligibility (under 35 U.S.C. 101) and patentability over the art (under 35 U.S.C. 102 and/or 103) is further discussed in MPEP § 2106.05(d)".
Regarding 35 USC 101, on pg. 14, applicant argues that the “receiving” limitation as provided in the new claims provides an improvement by ensuring higher accuracy with the schema inferences when incorrectly identifying row delimiters.
In response to the preceding argument, examiner respectfully submits that the claims do not mention incorrectly identifying row delimiters; therefore, it is not clear whether or how the claimed invention would tie to this alleged improvement.
	
	
Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Botner (US 2018/0314711) discloses an entirely automated system for the interpretation of the field layout for multi-field files uses a rich contextual framework;
Saurav (US 2018/0314883) discloses automatic detection of string and column delimiters in tabular data files.
	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM P BARTLETT whose telephone number is (469)295-9085.  The examiner can normally be reached on M-Th 11:30-8:30, F 11-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at (571)272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WILLIAM P BARTLETT/
Examiner, Art Unit 2169