Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Detailed Action
Remarks
This communication has been issued in response to Applicant’s amended claim language filed 25 October 2021.  Claims 1-4 & 6-8 remain pending in this application. 
In view of Applicant’s amended language and persuasive arguments, the rejection of Claims 1-4 & 6-8 under 35 USC 101 as being directed towards an abstract idea has been withdrawn.  Applicant has sufficiently demonstrated an improvement in the functioning of a computer and/or technical field through the incorporation of amended language corresponding to Applicant’s disclosure and further definition of a “reliability” metric indicative of probability of success in joining at least a first table and candidate tables.
 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7 & 8 are rejected under 35 U.S.C. 103 as being unpatentable over Gorelik et al (USPG Pub No. 20150356094A1; Gorelik hereinafter) in view of Mandelstein et al (USPG Pub No. 20130238596A1; Mandelstein hereinafter) further in view of Johnathan et al (USPG Pub No. 20160055205A1; Johnathan hereinafter).

As for Claim 1, Gorelik teaches a non-transitory computer-readable recording medium having stored therein a program that causes a computer to execute a process, the process comprising:
“selecting, by the computer, candidate tables that correspond to the first table from among the second tables, a record of the respective candidate tables including a first data item included in a record of the first table” (see pp. [0011], [0034]; e.g., the reference of Gorelik provides systems and methods for facilitating efficient analysis of vast amounts of data stored and accessed using data storage systems.  According to paragraph [0011], lineage and/or purpose discovery is conducted by identifying a number of different candidate pairs among several files, where each candidate pair includes a respective first file of several files and a respective second file of several files, such that the second file was created after the first file. A score is calculated for each of the one or more candidate pairs, which is indicative of the extent to which the second file of the pair
“acquiring, by the computer, a first coincidence degree of the first table for the respective candidate tables, the first coincidence degree indicating a number of coincidence between the first data item included in the record of the first table and the first data item included in the record of the respective candidate tables” (see pp. [0012-0014], [0035-0037], [0086]; e.g., Gorelik teaches obtaining one or more schema measures, considered equivalent in function to Applicant’s claimed coincidence degree, that can be determined based on an overlap of items contained between two or more schema tables of data.  The cited paragraph [0086] describes a process of schema-based analysis for computing a schema-based lineage score/measure for candidate pairs of files, where the overlap of fields and/or determination of fields having the same signature can be discerned.  A matching signature of at least a first schema element can be determined based on one or more of a plurality of factors such as overlap in the data values and overlap in most common data values, for example. As stated within the cited paragraph [0086], “If it is determined that two files are not independent data collections, the fact that those two files have identical or substantially similar schema can indicate that one file was derived from the other. Substantially similar schemas, substantially similar field signatures, and/or significant value overlap generally imply a possibility that one file is derived from another.”);
user selection of one or more of the values of a second schema element presented on the data display can be received and, based thereon, attributes of the second schema element, displayed on the metadata display, and that characterize the selected values, can be highlighted”.  As discussed within paragraph [0042], a meta-data registry is used to determine a signature for each of several files and the one or more respective schema elements of that file, where a particular schema element can be a Graph DB graph or portion of a relational/columnar/in-memory or non-relational database, for example.  The metadata registry, considered equivalent in function to a selected “third table”, stores information corresponding to each of the one or more schema elements of at least a first or second file of several files, reading on Applicant’s claimed limitation.  Information corresponding to the schema elements of the one or more files is displayed to a user using a metadata display of the graphical user interface); 
“acquiring, by the computer, a second coincidence degree of the one of the candidate tables for the respective third tables, the second coincidence degree indicating a number of coincidence between the second data item included in the record If it is determined that two files are not independent data collections, the fact that those two files have identical or substantially similar schema can indicate that one file was derived from the other. Substantially similar schemas, substantially similar field signatures, and/or significant value overlap generally imply a possibility that one file is derived from another”); and
“acquiring, by the computer, a reliability of the respective candidate tables on basis of both the first coincidence degree of the first table for the one of the candidate tables and the second coincidence degree of the one of the candidate tables for the respective third tables” (see Fig. 2(a-b); see pp. [0068-0072]; e.g., the reference of Gorelik teaches of the ordering of the pairwise lineage scores, where the comparison of 
The reference of Gorelik does not appear to explicitly recite the amended limitations of, “receiving, by the computer, a first table for which a maximum likelihood table is selected from among second tables” and “determining, by the computer, the maximum likelihood table for the first table from among the candidate tables based on the reliability of the respective candidate tables, the maximum likelihood table having a highest reliability among the candidate tables”.
The reference of Mandelstein recites the limitation of, “receiving, by the computer, a first table for which a maximum likelihood table is selected from among second tables” (see pp. [0058-0061]; e.g., the reference of Mandelstein serves as an enhancement to the combined teachings of the primary Gorelik reference by providing for acquiring one or more of a first table from a plurality of tables where a calculated 
The combined references of Gorelik and Mandelstein are considered analogous art for being within the same field of endeavor, which is data storage and retrieval systems and techniques for detecting reference data tables in Extract, Transform, and Load (ETL) processes.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the receiving of one or more of a first table having a “greater likelihood” among a plurality of second tables, as taught by Mandelstein, with the method of Gorelik because reference data management solutions are useful in data integration projects as transcoding tables are implemented for harmonizing reference data while data is exchanged between one or more source and target systems.  (Mandelstein; [0004-0005]) 
	The combined references of Gorelik and Mandelstein do not appear to explicitly recite the amended limitations of, “the reliability indicating a probability that the respective candidate tables are successively joined to the first table, and the first table and the candidate tables joining each other to generate a joining chain”, “determining, by the computer, the maximum likelihood table for the first table from among the candidate tables based on the reliability of the respective candidate tables, the maximum likelihood table having a highest reliability among the candidate tables”, and “outputting, by the computer, the maximum likelihood table.”
The reference of Johnathan recites the limitations of, 
	“the reliability indicating a probability that the respective candidate tables are successively joined to the first table, and the first table and the candidate tables joining each other to generate a joining chain” (see pp. [0089-0090], [0116]; e.g., the reference of Johnathan enhances the combined teachings of Gorelik and Mandelstein by teaching a similarity measure for at least a pair of compared fields of at least a first and second table selected as potential candidates when joining/merging tables.  The similarity measure is compared to a threshold to determine sufficient similarity in the values of the data fields to make further analysis worthwhile, as stated within paragraph [0052].  Paragraph [0089] teaches the utilization of a join tree, produced by a join recommendation engine, and translated into an SQL statement by creating a join based on the identified tables, columns, and directions for each edge.  Subsequent paragraph [0090] discusses the use of an additional component such as a “joinability analyzer”, which “...for a column C1 in table T1 and a column C2 in table T2, a joinability score {i.e. equivalent to Applicant’s “reliability” metric} that represents a predicted level of success {i.e. equivalent to Applicant’s “probability that the respective candidate tables are successively joined”} or a usefulness of an outcome in performing a join from C1 to C2. In one embodiment, when the determined joinability score for C1 to C2 exceeds (or meets) a pre-determined threshold, it is used as a weight for a directed edge added to the join graph 812”, thus making a determination as to the success of joining table columns, and using a determined weight to connect nodes by adding an edge to a join graph as a result.  Paragraph [0116] clearly teaches the creation of a join graph of nodes, where each node represents one or more of the plurality of database tables having a plurality of columns.
For each pair of first and second tables, different from one another, from a plurality of database tables, a score representative of a predicted level of success in performing a join from the first table on the first column to the second table on the second column is calculated, and a directed weighted edge is added to the join graph from a node representing the first table to a node representing the second table based on the score, in a process equivalent to the generation of Applicant’s “joining chain”);
“determining, by the computer, the maximum likelihood table for the first table from among the candidate tables based on the reliability of the respective candidate tables, the maximum likelihood table having a highest reliability among the candidate tables” (see pp. [0141-0145]; e.g., Paragraphs [0141-0145] teach additional aspects of Johnathan, such as the generation of a ranked list of identified fields of columns of a plurality of tables through suggestion/recommendation, and presenting selected identified fields on a display, reading on Applicant’s claimed limitation.  Within a ranked list of identified fields pertaining to a plurality of candidate tables having those identified fields, the top ranked candidate is responsive to a measured likelihood of matching values in identified fields of columns between at least a first and second data set.  Earlier text of paragraphs [0042-0044] discuss a first table from a first database and a represents a predicted level of success {i.e. equivalent to Applicant’s “probability that the respective candidate tables are successively joined”} or a usefulness of an outcome in performing a join from C1 to C2. In one embodiment, when the determined joinability score for C1 to C2 exceeds (or meets) a pre-determined threshold, it is used as a weight for a directed edge added to the join graph 812”, thus aiding in a determination as to the success of joining table columns, and using a determined weight to connect nodes by adding an edge to a join graph as a result); and
	“outputting, by the computer, the maximum likelihood table” (see pp. [0034], [0085], [0089]; e.g., paragraph [0034] briefly discusses the utilization of a “recommendation engine”, which takes statistical results as input, and derives one or more recommended joins.  A list of recommended joins can be output, with the list being sorted or unsorted, and presented to a user through a display or other output device.  In a similar fashion, paragraph [0085] teaches the use of a “join recommendation engine”, which receives tables through user selection and/or an automated system, and allows for the selection of required joins and edges for the generation of a join tree that includes nodes and edges of a join graph connecting all
	The combined references of Gorelik, Mandelstein and Johnathan are considered analogous art for being within the same field of endeavor, which is data storage and retrieval systems and techniques for detecting reference data tables in Extract, Transform, and Load (ETL) processes.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the determination of metrics such as a reliability and likelihood metric for the joining/merging of two or more database tables, as taught by Johnathan, with the methods of Mandelstein and Gorelik in order to alleviate the complexities of joining tables arising from separate and distinct databases with different table structures without any design coordination or if data sets are arbitrary and generated from unstructured data.  (Johnathan; [0005])
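For illustration only, the thresholded join-graph construction attributed to Johnathan above (a joinability score used as the weight of a directed edge when it meets or exceeds a pre-determined threshold) can be sketched as follows.  This is a minimal sketch and not Johnathan's implementation: the function name build_join_graph, the layout of the score dictionary, and the threshold value are hypothetical, and the joinability scores are taken as given inputs because the cited passages, as summarized here, do not state how they are computed.

```python
def build_join_graph(scores, threshold):
    """Build a directed, weighted join graph from pairwise joinability scores.

    scores:    hypothetical mapping {(table1, column1, table2, column2): score}
    threshold: pre-determined cutoff; an edge is added when score >= threshold
    Returns an adjacency mapping {table: [(target_table, column1, column2, weight), ...]}.
    """
    graph = {}
    for (t1, c1, t2, c2), score in scores.items():
        # Every table becomes a node, even if it ends up with no outgoing edge.
        graph.setdefault(t1, [])
        graph.setdefault(t2, [])
        # Only pairs of different tables whose score meets the threshold get an edge.
        if t1 != t2 and score >= threshold:
            graph[t1].append((t2, c1, c2, score))
    return graph


example_scores = {
    ("orders", "cust_id", "customers", "id"): 0.9,
    ("orders", "note", "customers", "name"): 0.1,
}
example_graph = build_join_graph(example_scores, threshold=0.5)
# example_graph["orders"] holds a single edge to "customers" with weight 0.9
```

Under these assumptions, a low-scoring column pair leaves both tables in the graph as nodes but contributes no edge between them.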

Independent Claims 7 & 8 amount to a method and apparatus, respectively, comprising instructions that, when executed by one or more processors, perform the process performed by the non-transitory computer-readable recording medium of Claim 1 (see pp. [0156-0158]; e.g., method for implementation integrating hardware and software components).


Claims 2-4 & 6 are rejected under 35 U.S.C. 103 as being unpatentable over Gorelik et al (USPG Pub No. 20150356094A1; Gorelik hereinafter) in view of Mandelstein et al (USPG Pub No. 20130238596A1; Mandelstein hereinafter), further in view of Johnathan et al (USPG Pub No. 20160055205A1; Johnathan hereinafter), and further in view of Parayatham et al (USPG Pub No. 20170344890A1; Parayatham hereinafter).

As for Claim 4, the reference of Gorelik teaches facilitating efficient analysis of vast amounts of data stored and accessed using data storage systems, Mandelstein teaches the retrieval of one or more “greater likelihood” designated table(s) from a plurality of tables having corresponding data, and Johnathan teaches a similarity measure for at least a pair of compared fields of at least a first and second table selected as potential candidates when joining/merging tables.
The combined references of Gorelik, Mandelstein and Johnathan are considered analogous art for being within the same field of endeavor, which is data storage and 
The references of Gorelik, Mandelstein and Johnathan do not appear to explicitly recite the limitations of, “acquiring the reliability of the one of the candidate tables by multiplying or adding the first coincidence degree of the first table for the one of the candidate tables and the second coincidence degree of the one of the candidate tables for the respective third tables”.
Parayatham teaches, “acquiring the reliability of the one of the candidate tables by multiplying or adding the first coincidence degree of the first table for the one of the candidate tables and the second coincidence degree of the one of the candidate tables for the respective third tables” (see pp. [0093-0097]; e.g., the reference of Parayatham serves as an enhancement to the teachings of Gorelik, Mandelstein and Johnathan by providing a process of determining the reliability of significant patterns {values/variables/objects/attributes} of one or more of a plurality of data records and applying techniques using input variables for such techniques as minimum probability, maximum expectations, confidence, significance and number of discrete intervals, in 
The combined references of Gorelik, Mandelstein, Johnathan and Parayatham are considered analogous art for being within the same field of endeavor, which is data storage and retrieval systems and automatic analysis of large quantities of data to extract using all reliable, significant and relevant patterns that occurred in the data.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the calculation of a reliability metric for each of a plurality of records, as taught by Parayatham, with the
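For illustration only, the Claim 4 limitation of acquiring the reliability “by multiplying or adding” the first and second coincidence degrees can be sketched as follows.  This is a minimal sketch under stated assumptions, not the method of Applicant or of any cited reference: the function name reliability and the mode parameter are hypothetical, and a single second coincidence degree per candidate table is assumed because the claim language quoted here does not say how several third tables would be aggregated.

```python
def reliability(first_degree, second_degree, mode="multiply"):
    """Combine two coincidence degrees into one reliability value.

    first_degree:  coincidence degree of the first table for a candidate table
    second_degree: coincidence degree of that candidate table for a third table
    mode:          "multiply" or "add" (hypothetical switch between the two
                   alternatives recited in the claim language)
    """
    if mode == "multiply":
        return first_degree * second_degree
    if mode == "add":
        return first_degree + second_degree
    raise ValueError("mode must be 'multiply' or 'add'")


product_reliability = reliability(0.5, 0.5)          # 0.25
sum_reliability = reliability(0.5, 0.25, mode="add")  # 0.75
```

With multiplication, a candidate scores highly only when both coincidence degrees are high; with addition, one high degree can offset a low one.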

As for Claim 6, the reference of Gorelik teaches facilitating efficient analysis of vast amounts of data stored and accessed using data storage systems, Mandelstein teaches the retrieval of one or more “greater likelihood” designated table(s) from a plurality of tables having corresponding data, and Johnathan teaches a similarity measure for at least a pair of compared fields of at least a first and second table selected as potential candidates when joining/merging tables.
The references of Gorelik, Mandelstein and Johnathan do not appear to explicitly recite the limitations of, “determining maximum likelihood tables for respective fourth tables by setting the respective fourth tables as the first table; selecting a first maximum likelihood table from among the maximum likelihood tables, the first maximum likelihood table having a highest reliability among the maximum likelihood tables; and outputting the first maximum likelihood table”.
Parayatham teaches, “determining maximum likelihood tables for respective fourth tables by setting the respective fourth tables as the first table; selecting a first maximum likelihood table from among the maximum likelihood tables, the first maximum likelihood table having a highest reliability among the maximum likelihood tables; and outputting the first maximum likelihood table” (see pp. [0093-0097], [0196-0201]; e.g., the reference of Parayatham serves as an enhancement to the teachings of Gorelik, 
The combined references of Gorelik, Mandelstein, Johnathan and Parayatham are considered analogous art for being within the same field of endeavor, which is data storage and retrieval systems and automatic analysis of large quantities of data to extract using all reliable, significant and relevant patterns that occurred in the data.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the calculation of a reliability metric for each of a plurality of records, as taught by Parayatham, with the methods of Johnathan, Mandelstein and Gorelik because current pattern searching methods do not consider all the values of the attribute in all the regions in the same way.  (Parayatham; 
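For illustration only, the two-level selection recited in Claim 6 (determining a maximum likelihood table for each fourth table treated in turn as the first table, then selecting the one with the highest reliability) can be sketched as follows.  This is a minimal sketch under stated assumptions, not the method of Applicant or of any cited reference: the function names and the nested-dictionary input are hypothetical, and the reliability values are taken as already computed.

```python
def maximum_likelihood_table(reliabilities):
    """Return (candidate_name, reliability) for the candidate with the highest reliability.

    reliabilities: hypothetical mapping {candidate_table_name: reliability}
    """
    return max(reliabilities.items(), key=lambda item: item[1])


def first_maximum_likelihood_table(per_first_table):
    """Pick the best candidate for each first table, then the best of those winners.

    per_first_table: hypothetical mapping {first_table_name: {candidate_name: reliability}}
    Returns (first_table_name, candidate_name, reliability).
    """
    # Step 1: one maximum likelihood table per (non-empty) first table.
    winners = {
        first: maximum_likelihood_table(candidates)
        for first, candidates in per_first_table.items()
        if candidates
    }
    # Step 2: the winner with the highest reliability overall.
    best_first = max(winners, key=lambda first: winners[first][1])
    best_candidate, best_reliability = winners[best_first]
    return best_first, best_candidate, best_reliability


example = {
    "A": {"X": 0.2, "Y": 0.6},
    "B": {"X": 0.8, "Z": 0.3},
}
result = first_maximum_likelihood_table(example)  # ("B", "X", 0.8)
```

The sketch raises ValueError when no first table has any candidate, since there is then nothing to select.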
As for Claim 2, the reference of Gorelik teaches “acquiring the first coincidence degree of the first table for the respective candidate tables by calculating a ratio of a number of first records of the first table with respect to a total number of records of the first table, the first data item included in the respective first records having a same value as a value of the first data item included in a record of the relevant candidate table” (see pp. [0012-0014], [0035-0037], [0086]; e.g., Gorelik teaches obtaining one or more schema measures, considered equivalent in function to Applicant’s claimed coincidence degree, that can be determined based on an overlap of items contained between two or more schema tables of data.  The cited paragraph [0086] describes a process of schema-based analysis for computing a schema-based lineage score/measure for candidate pairs of files, where the overlap of fields and/or determination of fields having the same signature can be discerned.  A matching signature of at least a first schema element can be determined based on one or more of a plurality of factors such as overlap in the data values and overlap in most common data values, for example. As stated within the cited paragraph [0086], “If it is determined that two files are not independent data collections, the fact that those two files have identical or substantially similar schema can indicate that one file was derived from the other. Substantially similar schemas, substantially similar field signatures, and/or significant value overlap generally imply a possibility that one file is derived from another”).
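For illustration only, the ratio recited in Claim 2 (the number of first-table records whose first data item matches a value in the candidate table, divided by the total number of first-table records) can be sketched as follows.  This is a minimal sketch under stated assumptions, not the method of Applicant or of Gorelik: the function name coincidence_degree and the representation of records as dictionaries keyed by data item name are hypothetical.

```python
def coincidence_degree(first_records, candidate_records, item):
    """Ratio of first-table records whose `item` value also occurs in the candidate table.

    first_records, candidate_records: lists of dicts (hypothetical record layout)
    item: name of the shared data item (column) being compared
    """
    if not first_records:
        return 0.0  # no records to compare; defined as 0 for this sketch
    # Distinct values of the data item that appear in the candidate table.
    candidate_values = {rec[item] for rec in candidate_records if item in rec}
    # Count first-table records whose value for the item appears among them.
    matching = sum(
        1 for rec in first_records if item in rec and rec[item] in candidate_values
    )
    return matching / len(first_records)


first_table = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]
candidate_table = [{"id": 2}, {"id": 4}, {"id": 9}]
degree = coincidence_degree(first_table, candidate_table, "id")  # 2 of 4 records match: 0.5
```

The ratio is taken over the first table only, so the measure is not symmetric: swapping the two tables in the example gives 2 of 3 instead of 2 of 4.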

Response to Arguments
Applicant's arguments and amendments with respect to the rejections of Claims 1-4 & 6-8 under 35 USC 103 have been fully considered and are persuasive in part.  The Gorelik, Mandelstein and Parayatham references have been maintained for their respective teachings; however, upon further consideration, new ground(s) of rejection are made in view of Johnathan et al (USPG Pub No. 20160055205A1; Johnathan hereinafter).

With respect to Applicant’s argument that:
	“However, as Applicant understands it, there is simply nothing in Gorelik that discloses, for example, that “the reliability indicating a probability that the respective candidate tables are successively joined to the first table, and the first table and the candidate tables joining each other to generate a joining chain,” as specifically recited in each of amended claims 1, 7 and 8 discussed above.

	As a result, Applicant believes that Gorelik further fails to disclose “determine[ing, by the computer,] the maximum likelihood table for the first table from among the candidate tables based on the reliability of the respective candidate tables, the maximum likelihood table having a highest reliability among the candidate tables,” as specifically recited in each of amended claims 1, 7 and 8 discussed above. [Emphasis added]

	As Applicants understand it, none of the secondary references including Mandelstein and Parayatham shows or suggests the feature of amended claim 1 discussed above, i.e., these secondary references fail to remedy the deficiencies of Gorelik.

	In view of the above, Applicant respectfully submits that each of amended claims 1, 7, and 8 is neither anticipated by nor rendered obvious over the reference cited by the Office Action (i.e., Gorelik, Mandelstein and Parayatham), either taken alone or in combination, for at least the reasons discussed above.

	Reconsideration and withdrawal of the rejections of claims 1, 7, and 8 under 35 U.S.C. §103 is respectfully requested.”

Examiner is persuaded in part, as the amended language involving “the reliability indicating a probability...” and “outputting...the maximum likelihood table” has been for a column C1 in table T1 and a column C2 in table T2, a joinability score {i.e. equivalent to Applicant’s “reliability” metric} that represents a predicted level of success {i.e. equivalent to Applicant’s “probability that the respective candidate tables are successively joined”} or a usefulness of an outcome in performing a join from C1 to C2. In one embodiment, when the determined joinability score for C1 to C2 exceeds (or meets) a pre-determined threshold, it is used as a weight for a directed edge added to the join graph 812”, thus making a determination as to the success of joining table columns, and using a determined weight to connect nodes by adding an edge to a join graph as a result.


Conclusion
	The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure.
	***Slager et al (US Patent No. 10,599,395A1) teaches dynamically merging database tables.	
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAHEEM HOFFLER whose telephone number is (571)270-1036. The examiner can normally be reached Monday-Friday: 10:00am-2:00pm; 6pm-10:00pm w/ flex.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached on 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is 
/TAMARA T KYLE/
Supervisory Patent Examiner, Art Unit 2156
/RAHEEM HOFFLER/
Examiner
Art Unit 2156

								3/15/2022