DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office action is responsive to the following communication:  Request for Continued Examination filed on 23 November 2020.
Claims 1-7, 9, 11-16, 18, and 20-23 are pending and present for examination.  Claims 1, 11, and 20 are in independent form.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 23 November 2020 has been entered.
 
Response to Amendment
Claims 1, 11, and 20 have been amended.
Claims 8, 10, 17, and 19 have been cancelled.
Claims 21-23 have been newly added.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11 December 2020 is being considered by the examiner.

Examiner’s Note
Examiner cites particular columns and/or paragraphs and line numbers in the references as applied to claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may be applied as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 6, 7, 9, 11, 12, 15, 16, 18, and 20-23 are rejected under 35 U.S.C. 103 as being unpatentable over Raufman, USPGPUB No. 2014/0129530, filed on 27 June 2012, claiming priority to 27 June 2011, and published on 8 May 2014, in view of Ahal et al, USPGPUB No. 2007/0266053, filed on 12 December 2006, and published on 15 November 2007, in view of Wang et al, USPGPUB No. 2016/0070707, filed on 5 April 2013, and published on 10 March 2016, and in further view of Chu et al, USPGPUB No. 2017/0286459, filed on 30 March 2016, and published on 5 October 2017, and in further view of Fontoura et al, USPGPUB No. 2005/0165838, filed on 26 January 2004, and published on 29 July 2005.
As per independent claims 1, 11, and 20, Raufman, in combination with Ahal, Wang, Chu, and Fontoura, discloses:
A database system comprising:
a processor {See Raufman, [0282], wherein this reads over “Processor unit 1104 is connected to a communication infrastructure 1102, which may comprise, for example, a bus or a network”};  and
memory coupled to the  processor and storing instructions that, when executed by the processor, {See Raufman, [0283], wherein this reads over “Computer system 1100 also includes a main memory 1106, preferably random access memory (RAM), and may also include a secondary memory 1120.  Secondary memory 11”} cause the database system to perform operations comprising:
identifying a fixed amount of random access memory (RAM) for use in sorting records from one or more database files in bounded memory {See Ahal, [0103], wherein this reads over “Specifically, when a lot of data is allocated to a single bin, as is the case with "hot spots", the RAM required to sort such data may not be available within the DPA, since the RAM within the DPA may not be large enough to load all of the data stored on disk.” And “Instead of loading all of the data within a bin into RAM, the bin is divided into sub-areas according to time. For example, the bin may be divided into sub-areas corresponding to one million write transactions. Data from each sub-area is loaded into RAM, sorted, and written back into the bin”};
sequentially streaming records from a first dataset and a second dataset electronically stored by the database system in the one or more database files {See Raufman, [0008], wherein this reads over “The invention may be embodied as a serial database system designed for quick load, efficient store and quick access for huge data sets.  The system includes a data container saved in computer readable memory/files holding data set records, possibly in encoded format ordered by their loading order”; and [0009], wherein this reads over “streaming row data records, composed of one or more columns, to a data loader program, wherein the row data may be in a csv format or any other predefined agreed format; adding the record of values sequentially to a pre-allocated computer readable data memory block”; and [0214], wherein this reads over “In step 200 the system waits for a batch of records to be forwarded after fetch.  Raw data 105 is provided as stream of raw records to the loader process 110”} within the fixed amount of RAM;
generating, based on the records from the first dataset and the second dataset, an inverted index data structure that maps respective content within the records to respective locations in the one or more database files {See Raufman, [0009], wherein this reads over “for each predefined number of records to be called "block size" (>=transaction size), creating inverted indexes for these record called block's inverted indexes, wherein the inverted index are created per column.  The inverted indexes block is a container for the columns inverted indexes”};
generating, based on the inverted index data structure and a key, a set of matching tuples {See Wang, [0013], wherein this reads over “if in the database, a tuple is regarded as a node, represented by V, and a primary key-foreign key relationship as an edge, represented by E, then the database may be represented as a graph G(V, E)”; and [0019], wherein this reads over “In one example, a tuple is regarded as a document, wherein the body of the document contains all the values stored in the columns of the tuple.  Further, each document is identified by a unique document ID.  In one example, the concatenation of table ID of the tuple and the primary key value of the tuple may be regarded as the document ID.  Thereafter, the inverted index is generated on all the tuples of all the relational tables”; and [0022], wherein this reads over “In operation, on receiving a keyword search query Q, a search may be conducted for detecting the presence of one or more of the keywords in the inverted index of the database to identify all relevant documents, i.e., the tuples.  In one example, the tuples may be identified based on the presence of the keywords.  Thereafter, a score function is computed between the query Q and each relevant document D”};
sorting the set of matching tuples based on the key within the fixed amount of RAM using a first sort process {See Chu, [0004], wherein this reads over “receiving, by one or more processors, a set of data to be sorted, wherein the set of data includes a number of tuples; determining, by one or more processors, a number of iterations of radix sorting to perform on keys of the set of data based, in part, on the number of tuples”; [0014], wherein this reads over “In a radix sort, each key is figuratively placed into a bucket, where a key is generally the next byte (or group of bytes) to be sorted”; and [0063], wherein this reads over “In this embodiment, memory 406 includes random access memory (RAM) 416 and cache memory 418.”};
writing data associated with the matching tuples that exceeds the fixed amount of RAM to a secondary storage in communication with the database system {See Fontoura, [0057], wherein this reads over “when a sort buffer 344, 346 is full, the sort buffer 344, 346 is handed off to an appropriate sort thread that performs radix sort and writes the sorted run to storage.”};
sorting the data associated with the matching tuples written to the secondary storage using a second sort process {See Fontoura, [0043], wherein this reads over “There are many variations on indexing and on compressing posting lists.  One approach is a sort-merge approach, in which sets of data are read into memory, each set sorted, and each set copied to storage (e.g., disk), producing a series of sorted runs.  The index is generated by merging the sorted runs.”}; and
generating, based on the sorted set of matching tuples, a new dataset joining elements from the first dataset and the second dataset {See Wang, [0021], wherein this reads over “On execution, the reformulated query template shall result in the generation of all the primary key combinations for each join result, which may be recorded as the join index”}. 
	Raufman is directed to the invention of fast loading, storing, and access to huge data sets in real time.  Raufman discloses the claimed features of “sequentially streaming records from a first dataset and a second dataset electronically stored by the database system in one or more database files” and 
Ahal is directed to the invention for multiple point in time access.  Specifically, Ahal discloses that “when a lot of data is allocated to a single bin, as is the case with "hot spots", the RAM required to sort such data may not be available within the DPA, since the RAM within the DPA may not be large enough to load all of the data stored on disk.”  See Ahal, [0103].  Additionally, Ahal discloses that “[i]nstead of loading all of the data within a bin into RAM, the bin is divided into sub-areas according to time” such that “the bin may be divided into sub-areas corresponding to one million write transactions” and “[d]ata from each sub-area is loaded into RAM, sorted, and written back into the bin.”  See Ahal, [0103].  That is, Ahal discloses a system wherein a specific amount of RAM (i.e. a fixed amount of RAM) may be used in storing bins of data (i.e. database files in bounded memory) for the purposes of sorting.  Accordingly, wherein Ahal is directed to providing sorted lists of data utilizing bins which are loaded into RAM, it would have been obvious to one of ordinary skill in the art to improve the prior art of Raufman with that of Ahal for the predictable result of a fast loading, storing, and access system for data sets (as disclosed by Raufman) with the aforementioned RAM allocation features (as disclosed by Ahal).
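For illustration only, the sub-area sorting quoted from Ahal, [0103], may be sketched as follows.  This is a minimal sketch and is not code from Ahal or from the application.  The function name `sort_bin_in_subareas`, the parameter `max_records_in_ram`, the (time, address) record layout, the choice of the address as the sort key, and the use of an in-memory list to stand in for a bin on disk are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical record layout: (write_time, address). Not taken from Ahal.
Record = Tuple[int, int]


def sort_bin_in_subareas(
    bin_records: List[Record], max_records_in_ram: int
) -> List[List[Record]]:
    """Sort a bin piecewise so that no more than max_records_in_ram
    records are held in the working buffer at any one time.

    bin_records is assumed to be in arrival (time) order, so consecutive
    slices correspond to sub-areas "according to time".
    """
    if max_records_in_ram <= 0:
        raise ValueError("max_records_in_ram must be positive")

    sorted_subareas: List[List[Record]] = []
    for start in range(0, len(bin_records), max_records_in_ram):
        # "Load" one sub-area into the bounded working buffer.
        sub_area = bin_records[start:start + max_records_in_ram]
        # Sort the sub-area by address (an assumed sort key).
        sub_area.sort(key=lambda record: record[1])
        # "Write back" the sorted sub-area.
        sorted_subareas.append(sub_area)
    return sorted_subareas
```

Each sub-area is sorted independently, so the result is a series of sorted pieces rather than one globally sorted bin.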
Wang is directed to the invention of keyword searching on databases.  Specifically, Wang discloses the use of inverted indices and tuples to determine matches.
	As per the claimed feature of “generating, based on the inverted index data structure and a key, a set of matching tuples,” Wang discloses that “a tuple is regarded as a document, wherein the body of the document contains all the values stored in the columns of the tuple” and “a tuple is regarded as a node, represented by V, and a primary key-foreign key relationship as an edge, represented by E” (i.e. the claimed feature of “a key”).  See Wang, [0013] and [0019]. Furthermore, Wang discloses that “on receiving a keyword search query Q, a search may be conducted for detecting the presence of one or more of the keywords in the inverted index of the database” (i.e. the claimed feature of “based on the 
As per the claimed feature of “generating, based on the sorted set of matching tuples, a new dataset joining elements from the first dataset and the second dataset,” Wang discloses that “On execution, the reformulated query template shall result in the generation of all the primary key combinations for each join result, which may be recorded as the join index.”  See Wang, [0021].  That is, Wang discloses that a join index (i.e. the claimed feature of “a new dataset”) may be generated based upon the primary key combinations for each join result (i.e. based on the sorted set of matching tuples).
Accordingly, wherein both Raufman and Wang are directed to utilizing inverted indices in executing queries, it would have been obvious to one of ordinary skill in the art to improve the prior art of Raufman with that of Wang for the predictable result of a fast loading, storing, and access system for data sets (as disclosed by Raufman) which allows for the matching of tuples and the generation of a new data set according to match results found from an inverted index (as disclosed by Wang).
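For illustration only, the inverted index and keyword matching quoted from Wang, [0019] and [0022], may be sketched as follows.  This is a minimal sketch and is not code from Wang or from the application.  The function names, the (primary key, text) row layout, the whitespace tokenization, and the ":" separator used when concatenating the table ID and the primary key value are assumptions made for the example.  The score function of Wang is omitted.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_inverted_index(
    tables: Dict[str, List[Tuple[int, str]]]
) -> Dict[str, Set[str]]:
    """Map each keyword to the document IDs of the tuples containing it.

    Each tuple is treated as a document whose ID is the table ID
    concatenated with the primary key value (separator is an assumption).
    """
    index: Dict[str, Set[str]] = defaultdict(set)
    for table_id, rows in tables.items():
        for primary_key, text in rows:
            doc_id = f"{table_id}:{primary_key}"
            for word in text.lower().split():
                index[word].add(doc_id)
    return dict(index)


def matching_tuples(
    index: Dict[str, Set[str]], keywords: Iterable[str]
) -> List[str]:
    """Return the IDs of tuples containing one or more of the keywords."""
    hits: Set[str] = set()
    for keyword in keywords:
        hits |= index.get(keyword.lower(), set())
    return sorted(hits)
```

A lookup for a keyword that appears in tuples of two different tables returns document IDs from both tables, which is the set of matching tuples from which a join result could then be formed.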
As per the claimed feature of “sorting the set of matching tuples based on the key within the fixed amount of RAM using a first sort process,” the combination of Raufman and Wang fails to disclose said features.  Chu is directed to the invention of increasing radix sorting efficiency utilizing a cross over point.  
Specifically, as per the claimed feature of “sorting the set of matching tuples based on the key within the fixed amount of RAM using a first sort process,” Chu discloses “receiving, by one or more processors, a set of data to be sorted, wherein the set of data includes a number of tuples” and “determining, by one or more processors, a number of iterations of radix sorting to perform on keys of the set of data based, in part, on the number of tuples” and that “memory 406 includes random access memory (RAM) 416 and cache memory 418.”  See Chu, [0004] and [0063].  That is, Chu discloses that data which includes tuples (i.e. a set of matching tuples) may be sorted.  With regards to “sorting… based on the key,” it is noted that Chu further discloses that “In a radix sort, each key is figuratively placed into a bucket, where a key is generally the next byte (or group of bytes) to be sorted.”  Id, [0016].  Accordingly, wherein Chu discloses the sorting of tuples based upon a key, it would have been obvious to one of ordinary skill in the art to improve the prior art combination of Raufman and Wang with that of Chu for the predictable result of a system wherein tuple data of Wang may be further sorted according to the sorting method of Chu.
As per the claimed feature of “writing data associated with the matching tuples that exceeds the fixed amount of RAM to a secondary storage in communication with the database system” and “sorting the data associated with the matching tuples written to the secondary storage using a second sort process,” the combination of Raufman, Wang, and Chu fails to disclose said features.
Fontoura is directed to the invention of architecture for an indexer.  Specifically, Fontoura discloses that “when a sort buffer 344, 346 is full, the sort buffer 344, 346 is handed off to an appropriate sort thread that performs radix sort and writes the sorted run to storage.”  See Fontoura, [0057].  That is, Fontoura provides for a sort buffer which may be stored in RAM such that upon an overflow incident with the sort buffer, the sorted run is written to storage, which may be an internal storage device such as a hard drive.  Wherein Fontoura is directed to indexing data, it would have been obvious to one of ordinary skill to improve the indexing systems of Raufman and Wang with that of Fontoura for the predictable result of a system wherein when data related to matching tuples exceeds a cache, the remainder of said data may be written to storage via the sort buffer of Fontoura.
As per the claimed feature of “sorting the data associated with the matching tuples written to the secondary storage using a second sort process,” Fontoura discloses that “[t]here are many variations on indexing and on compressing posting lists” wherein “[o]ne approach is a sort-merge approach, in which sets of data are read into memory, each set sorted, and each set copied to storage (e.g., disk), producing a series of sorted runs” and “[t]he index is generated by merging the sorted runs.” Id, [0043].  Accordingly, wherein Fontoura discloses a sort-merge method for generating an index, it would have been obvious to one of ordinary skill in the art to improve the prior art combination via the disclosed sort approach to provide an optimized index.
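For illustration only, the sort-merge approach quoted from Fontoura, [0043] and [0057], may be sketched as follows.  This is a minimal sketch and is not code from Fontoura or from the application.  The function names, the `buffer_size` parameter, the use of integer keys, and the temporary text files that stand in for secondary storage are assumptions made for the example.  The sketch sorts each full buffer with the built-in sort rather than the radix sort named in Fontoura, and it collects the merged output into a list only so that the result is easy to inspect.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def _write_run(sorted_keys: List[int]) -> str:
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run", text=True)
    with os.fdopen(fd, "w") as run_file:
        for key in sorted_keys:
            run_file.write(f"{key}\n")
    return path


def _read_run(path: str) -> Iterator[int]:
    """Stream the keys of one sorted run back from storage."""
    with open(path) as run_file:
        for line in run_file:
            yield int(line)


def external_sort(keys: Iterable[int], buffer_size: int) -> List[int]:
    """Sort keys while holding at most buffer_size keys in the sort buffer."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")

    run_paths: List[str] = []
    buffer: List[int] = []
    for key in keys:
        buffer.append(key)
        if len(buffer) >= buffer_size:
            # Buffer is full: sort it and write the sorted run to storage.
            buffer.sort()
            run_paths.append(_write_run(buffer))
            buffer = []
    if buffer:
        # Flush the final, partially filled buffer as a last run.
        buffer.sort()
        run_paths.append(_write_run(buffer))

    try:
        # Merge the sorted runs; heapq.merge reads each run lazily.
        return list(heapq.merge(*(_read_run(path) for path in run_paths)))
    finally:
        for path in run_paths:
            os.remove(path)
```

The in-memory sort is bounded by the buffer size, while the merge step reads only one key at a time from each run.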
As per dependent claims 2 and 12, Raufman, in combination with Ahal, Wang, Chu, and Fontoura, discloses:
The database system of claim 1, wherein the memory further stores instructions for causing the database system to receive, from a user system in communication with the database system, an electronic communication identifying one or more of: the first dataset, the second dataset, and the key {See Wang, [0019], wherein this reads over “In one example, a tuple is regarded as a document, wherein the body of the document contains all the values stored in the columns of the tuple.  Further, each document is identified by a unique document ID.  In one example, the concatenation of table ID of the tuple and the primary key value of the tuple may be regarded as the document ID.  Thereafter, the inverted index is generated on all the tuples of all the relational tables”; and [0022], wherein this reads over “In operation, on receiving a keyword search query Q, a search may be conducted for detecting the presence of one or more of the keywords in the inverted index of the database to identify all relevant documents, i.e., the tuples.  In one example, the tuples may be identified based on the presence of the keywords.  Thereafter, a score function is computed between the query Q and each relevant document D”}. 
As per dependent claim 3, Raufman, in combination with Ahal, Wang, Chu, and Fontoura, discloses:
The database system of claim 1, wherein content mapped by the inverted index data structure includes: a text string, an alphanumeric string, a numeric value, or combinations thereof {See Wang, [0017], wherein this reads over “Another approach of keyword based searches involves identifying the tuples which contain the keywords and then combining the identified tuples to form view tuples by using primary key-foreign key connections.  This approach reduces space maintenance overhead but increases processing load”}. 
As per dependent claims 6 and 15, Raufman, in combination with Ahal, Wang, Chu, and Fontoura, discloses:
The database system of claim 1, wherein the memory further stores instructions for causing the database system to store the new dataset in the one or more database files {See Raufman, [0083], wherein this reads over “The invention includes a data base management system with methods for loading, storing and accessing huge data sets efficiently.  The system introduces a new data structure and a new type of index that enables fixed O(1) access time to elements in huge data sets while utilizing economic memory space requirements of ~O(n) where n is the size of the set, and a new compression method based on the invented data structure and index”}. 
As per dependent claims 7 and 16, Raufman, in combination with Ahal, Wang, Chu, and Fontoura, discloses:
The database system of claim 6, wherein the new dataset is stored in a database file containing one or more of the first dataset and the second dataset {See Raufman, [0083], wherein this reads over “The invention includes a data base management system with methods for loading, storing and accessing huge data sets efficiently.  The system introduces a new data structure and a new type of index that enables fixed O(1) access time to elements in huge data sets while utilizing economic memory space requirements of ~O(n) where n is the size of the set, and a new compression method based on the invented data structure and index”}. 
As per dependent claims 9 and 18, Raufman, in combination with Ahal, Wang, Chu, and Fontoura, discloses:
The database system of claim 8, wherein the data written to the hard drive is compressed {See Fontoura, [0066], wherein this reads over “A final index thread 442 outputs compressed index files to storage 444”}. 
As per dependent claims 21-23, Raufman, in combination with Ahal, Wang, Chu, and Fontoura, discloses:
The database system of claim 1, wherein the first sort process is a radix sort process {See Chu, [0004], wherein this reads over “receiving, by one or more processors, a set of data to be sorted, wherein the set of data includes a number of tuples; determining, by one or more processors, a number of iterations of radix sorting to perform on keys of the set of data based, in part, on the number of tuples”; [0014], wherein this reads over “In a radix sort, each key is figuratively placed into a bucket, where a key is generally the next byte (or group of bytes) to be sorted”; and [0063], wherein this reads over “In this embodiment, memory 406 includes random access memory (RAM) 416 and cache memory 418.”} and the second sort process is a merge sort process {See Fontoura, [0043], wherein this reads over “There are many variations on indexing and on compressing posting lists.  One approach is a sort-merge approach, in which sets of data are read into memory, each set sorted, and each set copied to storage (e.g., disk), producing a series of sorted runs.  The index is generated by merging the sorted runs.”}.
As per the claimed feature of “a radix sort process,” Chu discloses that “In a radix sort, each key is figuratively placed into a bucket, where a key is generally the next byte (or group of bytes) to be sorted.”  See Chu, [0016].  Accordingly, wherein Chu discloses the sorting of tuples based upon a key, it would have been obvious to one of ordinary skill in the art to improve the prior art combination of Raufman and Wang with that of Chu for the predictable result of a system wherein tuple data of Wang may be further sorted according to the sorting method of Chu.
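For illustration only, a radix sort of tuples on a key, in which each pass places tuples into buckets according to the next byte of the key, may be sketched as follows.  This is a minimal sketch and is not code from Chu or from the application.  The function name `radix_sort_by_key`, the (key, payload) tuple layout, the restriction to non-negative integer keys, and the fixed `key_bytes` pass count are assumptions made for the example.  Chu determines the number of iterations based in part on the number of tuples, which this sketch does not attempt to reproduce.

```python
from typing import List, Tuple


def radix_sort_by_key(
    rows: List[Tuple[int, str]], key_bytes: int = 4
) -> List[Tuple[int, str]]:
    """Least-significant-byte-first radix sort of (key, payload) tuples.

    Keys must be non-negative integers smaller than 256 ** key_bytes.
    The sort is stable: tuples with equal keys keep their input order.
    """
    limit = 256 ** key_bytes
    for key, _ in rows:
        if key < 0 or key >= limit:
            raise ValueError(f"key {key} is outside the supported range")

    ordered = list(rows)
    for byte_index in range(key_bytes):
        shift = 8 * byte_index
        # One bucket per possible value of the current key byte.
        buckets: List[List[Tuple[int, str]]] = [[] for _ in range(256)]
        for row in ordered:
            buckets[(row[0] >> shift) & 0xFF].append(row)
        # Concatenate buckets in order to form the input of the next pass.
        ordered = [row for bucket in buckets for row in bucket]
    return ordered
```

Because each pass is stable, processing the key from its least significant byte to its most significant byte yields tuples in ascending key order after the final pass.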
As per the claimed feature of “a merge sort process,” Fontoura discloses that “[t]here are many variations on indexing and on compressing posting lists” wherein “[o]ne approach is a sort-merge approach, in which sets of data are read into memory, each set sorted, and each set copied to storage (e.g., disk), producing a series of sorted runs” and “[t]he index is generated by merging the sorted runs.” Id, [0043].  Accordingly, wherein Fontoura discloses a sort-merge method for generating an index, it would have been obvious to one of ordinary skill in the art to improve the prior art combination via the disclosed sort approach to provide an optimized index.
Claims 4, 5, 13, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Raufman, in view of Wang, Chu, and Fontoura, and in further view of Cao et al, USPGPUB No. 2017/0091305, filed on 30 September 2015, and published on 30 March 2017.
As per dependent claims 4 and 13, the combination of Raufman, Wang, Chu, and Fontoura fails to disclose the claimed feature of “wherein the locations to which content is mapped by the inverted index data structure correspond to integer values.”  Cao is directed to the invention of the dynamic grouping of tuples.  Specifically, Cao discloses that “[t]he tuples may have non-relational database relationships to other tuples of a stream application (e.g., individual values, key-value pairs, flat files, etc.)” and that “[t]uples may include values in a variety of known computer formats (e.g., integer, float, Boolean, string, etc.).”  See Cao, [0021]. Additionally, Cao discloses that “[t]he smart tuple may retrieve, at 442, information related to the set of identified tuples, such as the location within the stream application.”  See Cao, [0064].  Accordingly, wherein Cao discloses that smart tuples, which may take various formats such as an integer, may indicate the location of identified tuples, it would have been obvious to one of ordinary skill in the art to improve the prior art combination of Raufman and Wang with that of Cao for the predictable result of a dataset access system which may have tuples which further indicate a location and correspond to specific integer values.
As per dependent claims 5 and 14, Raufman, in combination with Ahal, Wang, Chu, Fontoura, and Cao, discloses:
The database system of claim 4, wherein the set of matching tuples are integer tuples associated with one or more of: a row identifier, a dimension value identifier, and a measure value {See Cao, [0021], wherein this reads over “The tuples may have non-relational database relationships to other tuples of a stream application (e.g., individual values, key-value pairs, flat files, etc.).  Tuples may include values in a variety of known computer formats (e.g., integer, float, Boolean, string, etc.)”}. 

Response to Arguments
Applicant’s arguments with respect to the claim rejections under 35 U.S.C. 103 have been considered but are moot because the arguments do not apply to the newly cited prior art combination being used in the current rejection.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL KIM whose telephone number is (571)272-2737.  The examiner can normally be reached on Monday-Friday, 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Paul Kim/
Examiner
Art Unit 2169



/PK/