Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed against amended claims of 11/20/2020 have been fully considered but they are not persuasive. 
{A} In re page 13, applicant states, “Neither Crupi nor Simitsis, either alone or in combination, disclose teach or suggest representative claim 1. As a general matter, Applicant does not observe in either Simitsis or Crupi (a) aspects related to the single syntax, including (i) receiving the one or more parameters that include, a location of input data for each dataset, and (ii) receiving, a read command using the single syntax. 
Relatedly, Applicant does not observe in either Simitsis or Crupi “determining, based on a location of a respective dataset, a storage format of the respective dataset, a storage environment of the respective dataset, or both, and creating the respective datastore object, such that the respective datastore object is adapted to the storage format and the storage environment of the respective dataset,” as claimed.
In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.

	In response, as pointed out in the rejection set forth previously and presently, Simitsis as understood does teach a Single Syntax, with respect to Fig. 2 (GUI) and Fig. 4, substantially as claimed.
This single syntax allows for processing a sequence of operations, the operations being integrated into a Single Syntax with an interface (SEE Fig. 2, a GUI), including accessing data from plural sources of formatted data in different storage formats, defined by their schema, as appears to be defined by the Map Reduce code as illustrated.
SEE 0018, source format defined by its schema.

In another example, the "xLM" code may capture operational properties (e.g., operation type, data schema, operation statistics, parameters or expressions for implementing an operation type, or execution environment details).
	
Note above, the xLM code captures the operational properties, including the schema (defining a DB format), which appears to be, as claimed, determined based on the location of a respective dataset, since the schema is that of the datasets.

Additionally, it is noted that Simitsis is a network-based system, wherein networked elements have addresses, a form of location data, used to access the networked sources.

If not clear, the examiner cites what is deemed conventional: Map Reduce on a network necessarily has at least one address, in addition to the locations of stored data at that address in a CLOUD network.

Note that Map Reduce is a programming model for processing very large data sets in parallel, and such a model also appears to read as a Single Syntax.
SEE below
[0008] As noted above, ETL tools allow users to specify a sequence of operations that process data from various sources or that perform other types of functions. These tools may also convert user specified operations into executable code. As infrastructure and data become more diverse, an entire sequence of operations may not be suitable for execution in just one environment. While some operations may work well in any execution environment, other operations may be more appropriate for a particular environment. For example, in one operation, a map reduce cluster on a cloud network may be better suited for analyzing log files and, in a second operation, standard query language ("SQL") may be better suited for joining the results of the analyses with a data base table. In one example, map reduce may be defined as a programming model for processing very large data sets in parallel.
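
For illustration only, the map reduce programming model referenced in 0008 can be sketched in a few lines of Python. This is a single-process sketch, not the implementation of Simitsis or of any map reduce cluster; the names `map_reduce`, `count_status` and `total`, and the two-field log format, are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Apply mapper to every record, group values by key, then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def count_status(line: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical mapper: a log line is assumed to be '<timestamp> <status>'."""
    parts = line.split()
    if len(parts) >= 2:
        yield parts[1], 1


def total(key: str, values: List[int]) -> int:
    """Hypothetical reducer: sum the counts emitted for one key."""
    return sum(values)


# Made-up log lines, used only to exercise the sketch.
LOG_LINES = ["t1 200", "t2 404", "t3 200"]
STATUS_COUNTS = map_reduce(LOG_LINES, count_status, total)
```

In a real cluster the map and reduce phases would run in parallel across nodes; the sketch only shows the shape of the model.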


Note, the syntax is seen as being single, as claimed, by being INTEGRATED, and does include at least parameters including location data in code 204 (tweetfeed and IDs, with temporal data).
The data corresponds to one or more parameters such as names, temporal parameters or location data.
See the location parameters of input data for each dataset in code 204, which includes a day and filter range, {TWEETDay > 731947}; the system, based on the single syntax (or model), performs ETL processing based on Figs. 2, 4 and 5.

Additionally, if not clear, Fig. 2 does perform receiving a read command using the single syntax, and at 0023 data sources are loaded to nodes of the tree, which appears to require read commands to extract the data from the sources and store it to the tree of nodes.


“Alternatively, a node in the tree may represent an execution environment into which a data source may be loaded.” Therefore, loading is from the data sources to tree nodes.

Therefore, Fig. 4, defined by a tree of nodes, is an example of a Single Syntax: one TREE including defined data sources (406) as well as integrated operations (1, 2). The tree shown in Fig. 4 is associated with Fig. 2 as well as the further processes in Figs. 3 and 5 (cost), generating results such as in Fig. 6, all of which appears to be a single syntax, since all is integrated.

Note at 0023, in consideration of loading the data (from a source), the system is adapted to represent the COST OF LOADING (see Fig. 5) the data into the execution environment represented by node 404, and C2 may represent the cost of loading the data into the execution environment represented by node 406. A path from the root node to a leaf node may represent a combination of execution environments that may be used to implement the sequence of operations.
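
For illustration only, the statement in 0023 that a root-to-leaf path represents a combination of execution environments can be sketched as a small tree walk. The node names and all cost values here are hypothetical and are not taken from Fig. 4; the sketch assumes each node carries the cost of the edge leading into it.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    name: str
    cost: float = 0.0  # assumed: cost of the edge leading into this node
    children: List["Node"] = field(default_factory=list)


def root_to_leaf_paths(
    node: Node, prefix: Tuple[str, ...] = (), total: float = 0.0
) -> Iterator[Tuple[Tuple[str, ...], float]]:
    """Yield every root-to-leaf path together with its summed cost."""
    path = prefix + (node.name,)
    total += node.cost
    if not node.children:
        yield path, total
        return
    for child in node.children:
        yield from root_to_leaf_paths(child, path, total)


# Hypothetical tree: two candidate environments for loading, one follow-on step each.
TREE = Node("root", children=[
    Node("env_404", cost=3.0, children=[Node("op2_env_a", cost=1.0)]),
    Node("env_406", cost=2.0, children=[Node("op2_env_b", cost=4.0)]),
])

PATHS = list(root_to_leaf_paths(TREE))
CHEAPEST = min(PATHS, key=lambda item: item[1])
```

Each yielded path is one candidate combination of environments, and the cheapest path is simply the one with the smallest summed cost.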
The above represents, read methods from sources to nodes to operations or ETL operations by Single Syntax (Fig. 2, 202 
 
The above is based on the tree (400) or Fig. 2: reads are performed by receiving a read command using the single syntax (the TREE in Fig. 4), by which data sources are extracted and loaded (see code 204) into the tree (Fig. 4).

The single syntax in Fig. 4, being a TREE or 202 having linked operations (see at least operations 1-4 or 414, 430) on data sources, can load the data (to tree nodes) from the data sources based on their schema and corresponding Single Syntax structures, as understood by the examiner.

In Fig. 2, code 204 of tab 212 is Map Reduce code, which extracts and loads TWEET data based on location data and filters it (temporally), with parameters including locations defined by IDs and names, as well as times (location and/or address, at that point), corresponding to the Map Reduce code and the tree of operations based on the sources loaded to the nodes of the tree to generate output data.


Note, as entered by a user with GUI 200, the sequence of operations coordinated across different execution environments represents the READ METHOD, wherein
“…a sequence of operations may be read…”

[0020] As shown in block 302 of FIG. 3, a sequence of operations may be read. As noted above, the sequence of operations may be entered by a user via GUI 200. The user may also send a request to convert a sequence of user-specified operations into executable code. Selection of the execution environment may be based on a metric associated therewith. The metric may be partially based on resource consumption and/or resources consumed when the sequence of operations is coordinated across different execution environments. Such metrics may be stored as standing data that may be configured in advance by an administrator. Furthermore, such metrics may be derived by executing benchmark programs in each candidate execution environment.

SEE 0022
[0022] In block 306, if it is determined that some operations in the sequence should be implemented in a map reduce execution environment, the map reduce execution environment may be adjusted such that a predefined performance objective of the sequence is achieved. In one example, an amount of data stored in a backup repository during execution of an operation may be adjusted. Such an adjustment may be made by balancing speed requirements and fault tolerance requirements in accordance with the performance objective. In other examples, different configuration adjustments may be made, including, but not limited to: a number of map or reduce tasks to execute in parallel; a number of reducers per task; block size of a file system used by map reduce; map reduce job scheduler; buffer size for sorting or merging; number of parallel copy operations; java heap size; and, amount of nodes to use in a cluster of computers carrying out the map reduce operation. It is understood that the foregoing is a non-exhaustive list of possible configurations and that each type of map reduce execution environment may have many different types of configurable environment variables. The variables may be configured as, for example, command line parameters, a configuration file, or the like.


Additionally, Crupi, in the same field of endeavor (ETL), also teaches various locations and/or addresses, as well as network-based processing (see wrapper, 0051) and accessing of remote sources of different storage data formats.
SEE addresses corresponding to network-based sources, as well as their data parameters, including locations by time (0409), names and/or network addressing.
SEE network addresses (link, IP and/or URL; 0177, 0178 and at least 0248), which are also locations (with time) of data at sources.
SEE 0409: Crupi utilizes an address or addresses, being location data.
[0072] FROM Clause syntax [0073] SELECT NAME, ADDRESS FROM congress/legislators/legislator 
[0075] SELECT NAME, ADDRESS FROM congress 
record path" is inferred by walking through the hierarchy of the data by looking for "repeating" nodes/elements in the data.

Also reference addresses or locations defined by URL (network source address), or a mashup with a link or address necessary to access resource datasets on the network, through the INTERNET.
 	[0248] Services addressable by URL 

Description of Disclosure - DETX (241):
Accessing files can be useful, but in most cases the datasets you want to work with come from databases or from other systems or applications. If applications provide a REST or Web Service interface, you can access and load data using <directinvoke> and the appropriate URL. We're going to load a CSV dataset with information on global manufacturing plants that is accessible from http://raw.github.com/jackbe/raql/master/data/mfgplants.csv. Then we will use the Where clause to filter the rows the mashup should work with. 1. Login to Presto and click RAQL Explorer in the main menu. This is the RAQL Explorer that you can use to explore RAQL queries. You can use this tool to play with results and queries when your dataset is accessible as a file or from a URL. See Explore RAQL with the Presto RAQL Explorer for more information about this tool. 2. Enter http://raw.github.com/jackbe/raql/master/data/mfgplants.csv as the Data Source. 3. Enter plants as the Alias Name. 4. Then enter this query to see the first 20 rows of the result: select * from plants limit 20 5. Click Run. The results, showing the first 20 rows in this dataset display in a grid, something like this: As the results show, this contains a list of manufacturing plants, by country and name, along with latitude/longitude information and statistics on the production lines at each site. 
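
For illustration only, the effect of the quoted query "select * from plants limit 20", followed by a Where-style filter, can be approximated in Python over a small in-memory CSV. This sketch does not use RAQL, Presto or <directinvoke>, and it does not fetch the URL quoted above; the column names and the three sample rows are assumptions made for the example.

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, List

# Made-up sample standing in for the plants dataset; column names are hypothetical.
SAMPLE_CSV = (
    "Country,Name,Latitude,Longitude\n"
    "US,Plant A,40.0,-75.0\n"
    "DE,Plant B,52.5,13.4\n"
    "JP,Plant C,35.7,139.7\n"
)


def select_all_limit(csv_text: str, limit: int) -> List[Dict[str, str]]:
    """Rough analogue of 'select * from <alias> limit <n>' over CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(islice(reader, limit))


def where(
    rows: List[Dict[str, str]], predicate: Callable[[Dict[str, str]], bool]
) -> List[Dict[str, str]]:
    """Rough analogue of a Where clause: keep only rows matching the predicate."""
    return [row for row in rows if predicate(row)]


ROWS = select_all_limit(SAMPLE_CSV, 20)
GERMAN_PLANTS = where(ROWS, lambda row: row["Country"] == "DE")
```

The limit is an upper bound, so a dataset with fewer than 20 rows is returned whole.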

Description of Disclosure - DETX (245):
TABLE-US-00009 <mashup name="CSVFilter"xmlns="http://www.openmashup.org/schemas/v1.0/EMML"xmlns:m- acro="http://ww w.openmashup.org/schemas/v1.0/EMMLMacro"xmlns:presto="http://wwwjackbe.com- /v1.0/EMMLPres toExtensions"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openmashup.org/schemas/v1.0/EMML/../schemas- /EMMLPrestoSpec.x sd"><operationname="Invoke"><presto:presto-meta name="created-using">RUI</presto:presto meta> <presto:presto-meta name="alias">plants</presto:presto-meta><presto:presto-meta name="endpoint url">http://raw.github.com/jackbe/raql/master/data/mfgplants.csv</p

Description of Disclosure - DETX (258):
The following example mashup retrieves performance data for stocks from a URL using <directinvoke>. It uses a RAQL query to package all the data for storage with <raql> and includes stream=`true` to treat the query results as a stream. 

Description of Disclosure - DETX (310):
To load a dataset with a compatible data format: [0344] From a file, enter the file name in Data Source. The file must be accessible in the Mashup Server's classpath. This generates a <variable> statement in the mashup. [0345] From a URL using the GET method, enter the URL in Data Source. This generates a <directinvoke> statement in the mashup. [0346] Enter the Name to use for this dataset. Then enter a query and run it: [0347] Enter the RAQL query in the field below the data source. [0348] Click Run. The results for a successful query displays below in a grid. Or open a query from the query list, to run queries you have already saved, and click Run. 


Therefore, network addresses, as well as locations of data defined by time or ranges, also read as address data defining data locations, as is common sense.

SEE Crupi also handles sources of differently formatted data.
Source 145 is a database storage format, being FIXED, static or stored data that “Doesn’t Change”, which as understood comprises historic data. SEE abstract.

Last but not least is a third source 141, whose source format or formats are undetermined, wherein the system determines the format of sources 141, while the DB-formatted data 145 comprises a schema, which is deemed required to extract (query or pull) the data with respect to the sources, including stored data.

	SEE 0004, 0018. If not clear, Crupi teaches different formats and determines formats based on the location of the datasets, since the data source 1301 provides data 1303 which has an a priori known data format, which is therefore determined prior to access, based on a correspondence with the location of a dataset.

[0004] Referring now to FIG. 13, a flow chart illustrating a conventional procedure for generating analytics will be discussed and described. FIG. 13 is a representation of a conventional process for handling large data and using analytics on the data. A data source 1301 provides data 1303 which has an a priori known data format, such as from a stock market. A process to generate conventional analytics 1321 inputs, in step 1323, a defined model for the data, such as a model for stock market data. In step 1325, the process runs the data into the pre-determined model which is known to be appropriate for the data. In step 1327, a user manually prepares queries which can be run on data in the pre-determined model. The queries are run, and the query results are displayed to the user in step 1329. 

Sources of hierarchical format are also not deemed limited to a single format.

[0018] According to another embodiment, at least one of the static data and the real-time data is originated by the first or second sources respectively in a hierarchical format. 

	In conclusion, the prior art is deemed to teach, as claimed, that sources of differently formatted data on a network require an address, as well as other parameters, to access, process and extract desired data, and is also deemed to teach, as claimed,
to determine based on the location of a respective dataset a storage format

The examiner poses a question: can stored data formatted to a DB be accessed and processed (ETL) without any prior knowledge of “the storage format or schema”, based on the prior art as applied?
	
	It appears that all formats are determined based on a location of the respective dataset (network STORED) prior to processing, since the formats are taught as determined “a priori”, which appears to be required to perform any extraction with a query process (to decode) in any ETL process of any source of formatted data.
That is, based on a stored location of a respective dataset, a storage format of the respective dataset is determined (based on at least the “XML code”, prior knowledge, or even determining the format upon receiving) prior to performing the extraction (decoding), which utilizes the schema (defined format) by capturing the properties of the stored data sources (of and FROM THE SOURCES) in order to process the data with SQL, as understood.
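
For illustration only, one way a storage format and a storage environment might be inferred from a dataset location is sketched here. This is not taken from Simitsis, Crupi or the claims; the lookup tables, the labels such as "web" and "hadoop", and the function name `describe_location` are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Tuple
from urllib.parse import urlparse

# Hypothetical lookup tables: file suffix -> format, URL scheme -> environment.
FORMAT_BY_SUFFIX = {".csv": "csv", ".json": "json", ".xml": "xml"}
ENVIRONMENT_BY_SCHEME = {
    "": "local-file",
    "file": "local-file",
    "http": "web",
    "https": "web",
    "hdfs": "hadoop",
}


def describe_location(location: str) -> Tuple[str, str]:
    """Return (storage_format, storage_environment) guessed from a location string."""
    parsed = urlparse(location)
    suffix = PurePosixPath(parsed.path).suffix.lower()
    storage_format = FORMAT_BY_SUFFIX.get(suffix, "unknown")
    environment = ENVIRONMENT_BY_SCHEME.get(parsed.scheme, "unknown")
    return storage_format, environment


WEB_CSV = describe_location("http://raw.github.com/jackbe/raql/master/data/mfgplants.csv")
LOCAL_JSON = describe_location("data/plants.json")   # hypothetical relative path
CLUSTER_LOG = describe_location("hdfs://cluster/logs/part-0000")  # hypothetical path
```

The sketch reads only the location string itself; it does not open the dataset or consult any schema.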

	The examiner is not persuaded by the amendments and corresponding arguments and has reapplied the prior art as Simitsis in view of Crupi, for clarity. 

The examiner suggests that applicant request an interview to discuss any potentially distinguishable subject matter, in an effort to enhance record clarity and compact prosecution by defining common ground. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-4, 7-16 and 18-24 are rejected under 35 U.S.C. 103 as being unpatentable over Simitsis et al. (US 2014/0101092, FD 10/8/2012) in view of Crupi et al. (US 20140351233, FD 5/23/2014, w/provisional filings, 5/24/2013).
Regarding amended claim 1, Simitsis is deemed to teach as claimed, comprising: a method for interacting with a plurality of datasets, the method comprising:
using, by a device, a unified datastore (200), to interact with the plurality of datasets 

See Fig. 2, pane 202, with INPUTs (sources) and OUTPUTs,

allowing for integrated source input or loading, operations and output (an ETL process), based on sequences created by users, to be run and/or even optimized (step 306), based on the Single Syntax.
As shown in Fig. 2, 200 is a user GUI, or a Unified Interface, associated with Fig. 1 (apparatus), with at least panels 202 and 204 and tabs.
 
The panels are associated with TABS (206-214) defining at least input, sources and output, or a UNIFIED VIEW of the plurality of datasets, allowing a user (with the GUI) to INTERACT by defining (flow) and editing in the unified datastore (or a FLOW; 206, 208, 210, 212 and 214; SEE 0017, 0020), based on the single integrated syntax (or rules).

O	wherein:
the plurality of datasets (such as: Data sets, Output, multiple input sources, “…or 3 data sets…”), 
each, (Can Have or adapted to have), 
a plurality of different storage formats (Schema), and are (can be), stored in a plurality of different storage environments 

SEE Fig. 5, 0025, 501-509

Also consider, the background @ 0001 
“…Many organizations maintain heterogeneous systems of information technology infrastructure comprising assorted data formats originating from multiple sources…” 

	It appears one source can comprise plural formats originating from multiple sources.

Fig. 2 illustrates code 204, which includes an extract, LOAD TWEETFEED with ID, and includes various parameters including names and location data, such as tweetday. The code generates sentiment data associated with the input data of a storage-formatted source defined by schema, which as understood is required IN ORDER TO EXTRACT THE DATA from the source, based on SQL (210) and/or Map Reduce code (212).

SEE SQL code details, based on “…a user clicking on structured query language (" SQL") tab 210 may cause the display of SQL code in right panel 204. Such SQL code may be used to implement some operations in left panel 202…”

[0018] A user clicking on flow information tab 206 may cause meta-data associated with the specified operations to be shown in right panel 204. A click on flow information tab 206 may also cause other information to be shown, such as a graph representation of the sequence of operations. A user clicking on xLM tab 208 may cause customized extensible markup language ("XML") code to be displayed in right panel 204. Such code may represent the sequence of operations specified in left panel 202. The "xLM" code may capture information regarding data structures used to implement the sequence of operations (e.g., nodes and edges of a graph or hierarchical tree of interlinked nodes). The "xLM" code may also capture design meta-data (e.g., functional and non-functional requirements or resource allocation). In another example, the "xLM" code may capture operational properties (e.g., operation type, data schema, operation statistics, parameters or expressions for implementing an operation type, or execution environment details). A user clicking on structured query language (" SQL") tab 210 may cause the display of SQL code in right panel 204. Such SQL code may be used to implement some operations in left panel 202 as determined by optimizer module 116. A user clicking coordination tab 214 may cause the display of executable code in right panel 204 that coordinates each operation in the process displayed in left panel 202. Once the execution environments are selected, GUI 200 may show tabs that permit a user to view or edit the generated code executable therein.

Note that “Pigstore” defines name parameters and defines data, thereby generating tweetSentiment with a filter parameter defining the output format, from sources in their source format, and also defines the output formats (being results) in at least another format (Fig. 6), in view of an ETL process model.

See Store … INTO … USING…, as formatted output data from formatted input data of a plurality of sources of formatted data in different formats, based on at least their schema.

	Note input data can be any of semi-structured data, unstructured data, or sources of assorted data formats (0001) to be processed from multiple sources, wherein data warehouses provide tools to perform ETL operations (see 0001) on sources of formatted data in different storage formats.

Note, all stored data is deemed to be in a storage format which, when processed or encoded, generates another format based on the encoding (being one or more steps); the data only requires conversion to any other form.

If not clear, please consider, 0018 (Operational Properties, including, “data schema” or “defining various 

Note, as understood, the conventional definition of SCHEMA, includes, 
“The database schema of a database is its structure, term schema refers, to the organization of data as a blueprint of how the database is constructed … with respect to tables in databases.”
	
Note sources are processed based on their native or given format (based on their schema), including various table formats, allowing for conventional SQL query access to each source table (or DB) based on its format, defined by its schema.

As understood, conventional SQL requires sources of structured data in a database or table, formatted as defined by a schema. Processing, or querying to extract, the sources of input data (at the source) is based on their DB schema (format), known a priori, enabling the structured data to be decodable, as understood.

SEE below, “…the "xLM" code may capture operational properties (e.g., data schema…”.

Therefore the sources (inputs) are adapted to have different formats, allowing for data extraction (query) based on their schema with the structured query language, or code, thereby accessing structured data based on their schema, defining different storage formats.

SEE that the data schema is among the operational properties and that the SQL code is directed to the operations in left panel 202.
[0018] A user clicking on flow information tab 206 may cause meta-data associated with the specified operations to be shown in right panel 204. A click on flow information tab 206 may also cause other information to be shown, such as a graph representation of the sequence of operations. A user clicking on xLM tab 208 may cause customized extensible markup language ("XML") code to be displayed in right panel 204. Such code may represent the sequence of operations specified in left panel 202. The "xLM" code may capture information regarding data structures used to implement the sequence of operations (e.g., nodes and edges of a graph or hierarchical tree of interlinked nodes). The "xLM" code may also capture design meta-data (e.g., functional and non-functional requirements or resource allocation). In another example, the "xLM" code may capture operational properties (e.g., operation type, data schema, operation statistics, parameters or expressions for implementing an operation type, or execution environment details). A user clicking on structured query language (" SQL") tab 210 may cause the display of SQL code in right panel 204. Such SQL code may be used to implement some operations in left panel 202 as determined by optimizer module 116. A user clicking coordination tab 214 may cause the display of executable code in right panel 204 that coordinates each operation in the process displayed in left panel 202. Once the execution environments are selected, GUI 200 may show tabs that permit a user to view or edit the generated code executable therein.


If not clear, the Map Reduce code (212) is processed with respect to the whole pane 202, while the SQL code (210) appears to be directed to one or more of the operations 1, 2, 3, 4 of pane 202.

SEE Fig. 5 and 0025, which illustrate a matrix or table with details including operations (501-505) vs. environments (506-508, e1-e3), wherein “…Each cell flagged with an "X" may indicate that the execution environment corresponding to the column thereof is a candidate for executing the operation corresponding to a given row…”. The matrix allows for cost analysis and pruning, and includes the cost of transitioning to another execution environment.

[0025] FIG. 5 shows an alternative representation that may be used to select an execution environment for each operation. In one example, the matrix data structure shown in FIG. 5 may be generated from a hierarchical tree of interlinked nodes after removing or "pruning" sections of the tree whose aggregate cost falls below or exceeds a predetermined threshold. In this example, each row 501-505 may represent an operation and each column 506-508 may represent a candidate execution environment. Each cell flagged with an "X" may indicate that the execution environment corresponding to the column thereof is a candidate for executing the operation corresponding to a given row. Each arrow projecting from the cell [e.sub.1, O.sub.1] may represent a cost of transitioning from an implementation of O.sub.1 in execution environment e.sub.1 to an implementation of O.sub.2 in another execution environment. The cost of transitioning from [e.sub.1, O.sub.1] to [e.sub.1, O.sub.2] is shown as infinity, since e.sub.1 is not a candidate for executing operation O.sub.2. However C.sub.1 may represent the cost of transitioning from [e.sub.1, O.sub.1] to [e.sub.2, O.sub.2] and C.sub.2 may represent the cost of transitioning from [e.sub.1, O.sub.1] to [e.sub.3, O.sub.2].
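
For illustration only, the candidate matrix of 0025 and its infinite cost for non-candidate cells can be sketched as follows. The candidate sets, the flat switching cost and the function names are hypothetical; the reference does not give numeric values for C1 or C2, and the selection procedure here is a simple dynamic program chosen for the example, not the method of Simitsis.

```python
import math
from typing import Dict, List, Set, Tuple

# Hypothetical "X" cells: operation -> set of candidate execution environments.
CANDIDATES: Dict[str, Set[str]] = {
    "O1": {"e1", "e2"},
    "O2": {"e2", "e3"},
}
SWITCH_COST = 5.0  # assumed flat cost of moving between two environments


def transition_cost(src_env: str, dst_op: str, dst_env: str) -> float:
    """Infinity when dst_env is not a candidate for dst_op, as in the quoted 0025."""
    if dst_env not in CANDIDATES[dst_op]:
        return math.inf
    return 0.0 if src_env == dst_env else SWITCH_COST


def cheapest_plan(operations: List[str]) -> Tuple[float, List[str]]:
    """Pick one candidate environment per operation, minimising total transition cost."""
    best = {env: (0.0, [env]) for env in CANDIDATES[operations[0]]}
    for op in operations[1:]:
        step = {}
        for env in CANDIDATES[op]:
            cost, path = min(
                ((c + transition_cost(prev, op, env), p) for prev, (c, p) in best.items()),
                key=lambda item: item[0],
            )
            step[env] = (cost, path + [env])
        best = step
    return min(best.values(), key=lambda item: item[0])


PLAN = cheapest_plan(["O1", "O2"])
```

With these made-up inputs, staying in e2 for both operations avoids any switching cost, which is why that plan is selected.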

Therefore, the above teaches, as claimed, that each of the plurality of datasets includes input data (Fig. 2 and data sources), having one of the plurality of different storage formats (by schema) and being stored in one of the plurality of different storage environments.
SEE Fig. 5, 506-508; Map Reduce is an environment.

wherein the unified datastore (SEE 0023, TREE with nodes into which a data source may be LOADED),

providing a single interface (Fig. 2, GUI, along with Figs. 4-5 and an output at Fig. 6) for an application (Fig. 2), through which the plurality of datasets are accessible for processing by the application using a single syntax (based on Figs. 2 and 4). 
SEE formatted output data, such as 10G ROWS based on time in seconds, based on formatted input data parameters as defined by time and day, name, etc.

O	interacting (w/Single Interface 200), by the device and using the unified datastore (Figs. 2 & 4), with one or more of the plurality of datasets (Data Sources 406 and 404, to Nodes), wherein the interacting comprises:
receiving one or more parameters (0008, 0016-0021, 0024, 0027), 
via the single syntax (SEE Figs. 2, 4-5, 0023), wherein the one or more parameters comprise information identifying a location corresponding to input data for each respective dataset

SEE location data with names and times in Fig. 2, code 204: LOAD “tweetfeed” with IDs USING Pigstore, and filtering based on tweet IDs, including TWEET DAY, for each respective dataset (see 204) vs. nodes in Fig. 4.

SEE Parameters (0018, 0022)
creating a plurality of datastore objects (see Fig. 4, to Nodes 404, 406), using the one or more parameters of the code, in Fig. 2, comprising wherein for each respective datastore object (Fig. 2 and 4), of the plurality of datastore objects: 

determining, based on a location of a respective dataset a storage format of the respective dataset, 
a storage environment of the respective dataset, or both

SEE the generated table in Fig. 5, with storage environments defined based on the sources, as well as the data formats, defined through schema, which are also defined by the sources.

SEE 0018, at least schema.
 
In another example, the "xLM" code may capture operational properties (e.g., operation type, data schema, operation statistics, parameters or expressions for implementing an operation type, or execution environment details).




creating the respective datastore object, such that the respective datastore object is adapted to the storage format and the storage environment of the  respective dataset (in, Fig. 5), wherein:

SEE Environments including, at least the Map Reduce environment and other environments (see step 304 and Fig. 5, 506-508) vs. operations (501-505) vs. cost, wherein as understood, is as claimed, creating the respective datastore object, such that the respective datastore object is adapted to the storage format and the storage environment of the respective dataset that corresponds to the respective datastore object.

Simitsis further is deemed to teach further as claimed, the respective datastore object interfaces the input data of the respective dataset, and the respective datastore object includes a read method for reading a chunk

SEE 0026, 10 gigabyte rows

Wherein, the read method is adapted to the storage format and the storage environment of the respective dataset, and the chunk is a subset of the input data of the respective dataset.
SEE “rows of data”, or a chunk defined by “10 gigabyte rows of data”.

Therefore, Simitsis teaches, receiving a read command to read at least a portion of a first dataset of the plurality of datasets to nodes in Fig. 4, wherein the read command is specified using the single syntax and executing a first read method specified by the first datastore object to read a first chunk of first input data of the first dataset into memory associated with one or more first processors configured to process the first input data.
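
For illustration only, the limitations discussed above (a datastore object created from a location, adapted to the storage format, and exposing a read method that returns a chunk through one common call) can be pictured with the minimal sketch below. It is not taken from Simitsis, Crupi, or the application. The names `CsvDatastore`, `JsonLinesDatastore`, `create_datastore`, and `chunk_rows` are hypothetical, and inferring the format from a file extension is an assumption made only to keep the sketch short.

```python
import csv
import json
import os
from itertools import islice


class CsvDatastore:
    """Hypothetical datastore whose read method is adapted to CSV text."""

    def __init__(self, location):
        self._fh = open(location, newline="")
        self._rows = csv.DictReader(self._fh)

    def read(self, chunk_rows):
        # Return at most chunk_rows rows: a subset of the input data.
        return list(islice(self._rows, chunk_rows))

    def close(self):
        self._fh.close()


class JsonLinesDatastore:
    """Hypothetical datastore whose read method is adapted to JSON lines."""

    def __init__(self, location):
        self._fh = open(location)

    def read(self, chunk_rows):
        # One JSON object per line; read at most chunk_rows lines.
        return [json.loads(line) for line in islice(self._fh, chunk_rows)]

    def close(self):
        self._fh.close()


def create_datastore(location):
    # Assumption: the storage format is inferred from the file extension only.
    ext = os.path.splitext(location)[1].lower()
    if ext == ".csv":
        return CsvDatastore(location)
    if ext == ".jsonl":
        return JsonLinesDatastore(location)
    raise ValueError(f"unsupported location: {location}")
```

In this sketch the caller issues the same `read(chunk_rows)` call regardless of which format-specific object `create_datastore` returned, which is the sense in which a single syntax is used.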

Simitsis fails to specifically state
O	a size of the chunk does not exceed a memory size of the memory associated with the one or more processors

Crupi is deemed to render this obvious, as claimed. In Crupi, “preventing overflow” is part (or a parameter) of a read method associated with a source, which reads in chunks of a predetermined set size that does not exceed a memory size of the memory (i.e., prevents overflow). As understood, the read method includes evicting (by least recent use, LRU), or in other words making room for new datasets, which prevents exceeding a 

Note: evicting to make room allows the next chunks to be read into available memory.

Additionally, Crupi specifically teaches that the In-Memory Store does not overflow to disk if the datasets exceed available memory.

This is understood as first evicting to make room, prior to storing newly read chunks, which prevents overflow to disk.

SEE Below @ 0567

[0567] In-Memory Dataset Management … The In-Memory Store also does not overflow to disk if the datasets you store exceed available memory. If the memory allocated to BigMemory is full, datasets are evicted based on least recent use to make room for new datasets. 
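
As a reading aid only, the eviction behavior quoted from 0567 (evict by least recent use to make room, rather than overflow to disk) can be sketched as follows. This is a minimal sketch and not Crupi's implementation. The class name `InMemoryStore`, the parameter `capacity_rows`, and the choice to measure capacity in rows are all assumptions introduced for the example.

```python
from collections import OrderedDict


class InMemoryStore:
    """Hypothetical store: evicts least recently used datasets, never spills to disk."""

    def __init__(self, capacity_rows):
        self.capacity_rows = capacity_rows
        self._datasets = OrderedDict()  # name -> rows, least recently used first

    def used_rows(self):
        return sum(len(rows) for rows in self._datasets.values())

    def put(self, name, rows):
        if len(rows) > self.capacity_rows:
            raise ValueError("dataset larger than total capacity")
        self._datasets.pop(name, None)  # drop any older copy of this dataset
        # Evict first, to make room, before storing the new dataset.
        while self.used_rows() + len(rows) > self.capacity_rows:
            self._datasets.popitem(last=False)  # remove least recently used
        self._datasets[name] = rows

    def get(self, name):
        rows = self._datasets[name]
        self._datasets.move_to_end(name)  # mark as most recently used
        return rows

    def names(self):
        return list(self._datasets)
```

Because `put` evicts before it stores, the total held in the sketch never exceeds `capacity_rows`, which mirrors the examiner's reading that room is made prior to storing newly read data.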

In view of the above, Crupi is deemed to teach, as claimed, that setting the size of the chunk, and chunking (0161, 0103), is part of the read method, and that the size is set so that it does not exceed a memory size of the memory, in association with preventing overflows. The memory is associated with the read method, which has a size setting, and with the one or more processors; that is, the chunk size (as set) does not exceed a memory size of the memory associated with the one or more processors.

Note, regarding chunking (such as 1 million records and a filter operation): the system can perform reads in chunks, wherein the chunk size = partition size (such as 10,000 records), reading into buckets of 10,000 tuples.
[0103] With respect to "chunking," consider 1 million records and a filter operation. The system can read each record, … The chunk can be just one record. If the data is stored in-memory: the user can specify the chunk-size (=partition size), for example, 10,000 records. The system can read in and store the records in buckets of 10,000 tuples. The operation, for example, a filter operation, will be read in 10,000 chunks, operate on them, and then push them out to the client.
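
The chunked filter described in 0103 can be pictured with the short sketch below. It is an illustration under stated assumptions, not Crupi's code: the function names `read_in_chunks` and `filter_in_chunks` are hypothetical, and the default of 10,000 records simply reuses the example figure from the quoted paragraph.

```python
from itertools import islice


def read_in_chunks(records, chunk_size=10_000):
    """Yield successive lists of at most chunk_size records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def filter_in_chunks(records, predicate, chunk_size=10_000):
    """Read one chunk, filter it, and push the result out before reading the next."""
    for chunk in read_in_chunks(records, chunk_size):
        yield [r for r in chunk if predicate(r)]
```

With 1 million records and the default chunk size, `read_in_chunks` yields 100 chunks, and only one chunk of 10,000 records is held by the reader at a time.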

SEE reading with a set chunk size and chunking (0161, 0184), which is part of the read method, the size being set so that it does not exceed a memory size, by evicting, based on least recent use, to make room for new datasets prior to storing new chunks.

The differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art. It would have been obvious to modify Simitsis with the teachings of Crupi by setting the size of the chunk so that the chunk size does not exceed a memory size of the memory associated with the one or more processors, thereby performing memory management that prevents overflow by first evicting chunks, based on least recent use, to make room for new chunks prior to reading in the next chunks. This renders obvious setting the size of the chunk (as a predetermined fixed size) so that it does not exceed a memory size (preventing overflow, including loss of data) of the memory associated with the one or more processors. The read method includes the setting, based on a preset, predetermined chunk size (see 10,000 records), which appears to simplify memory management by evicting a predetermined amount of space (10,000 records) to make room for next chunks of the same size. Because the set, predetermined fixed chunk size is a parameter of, and directed at, the read method, it prevents exceeding the memory size and thereby prevents overflow of the memory associated with the processors, as taught by Crupi. This teaching is also deemed obvious to apply to storing data to the nodes of Simitsis, as in Fig. 4, which correspond to a read operation, thereby preventing exceeding a node capacity (available space) associated with the node and with the one or more processors, as claimed.

Regarding claim 2, the combination as applied is deemed to render obvious, the method of claim 1, wherein the one or more parameters further comprise one or more of: 
a type of the input data (see Crupi, 0081, 0082, from XML or CSV or JSON, formats to, a tuple format), of the respective dataset, 
a format of the input data (SEE Crupi, 0004, source 1303, “has a priori known data format”, from the stock market; also 0018, 0112, from an input format to a tuple format, 0162), of the respective dataset (see 0001, w/Schema), 
an offset (Crupi, offset, 0541- and Fig. 3, Window with Lead or Lag), for reading from the input data of the respective dataset, 
a size of the chunk (see Crupi, 0103, 0161, 0184, sets chunk size = partition size; Simitsis 0026, 10 gigabyte rows), 
O	a condition for determining the chunk size (SEE Crupi, the size is set by a user or by default) 

a query for deriving the input data of the respective dataset (see Simitsis)

Simitsis, as well as Crupi, teach, as claimed (0015, 0021-, 0023), use of a query that comprises a type of input (0091, 0112, 0150, 0153, 1070, 0402), wherein the query pulls (to derive, from somewhere) the input data from sources.

Regarding claim 3, the combination as applied is deemed to render obvious, wherein the one or more parameters further comprise one or more additional parameters, derived from the location of the input data, the one or more additional parameters comprising one or more of: 
O	a type, of the input data of the respective dataset, or
o	a format of the input data of the respective dataset

SEE Simitsis: the schema is deemed derived from the location of the source of formatted data, as captured from the “xLM” code; see Crupi as well.

SEE Simitsis, 0018, to derive (to receive or obtain from a source).
“…In another example, the "xLM" code may capture operational properties (e.g., operation type, data schema, operation 

Also see Crupi, in which a format or type of the source data is derived, as additional parameters, from the location data (see claim 2), including a format of the input data (SEE Crupi, 0004, source 1303, “has a priori known data format”), which is from the location of the source of formatted data.

Regarding claim 4, the combination as applied is deemed to render obvious, wherein the size of the chunk is set, by the respective datastore object
SEE Crupi, 0103, chunk size, is set = partition size
The chunk size set is predetermined and can be set by the user (0103, 0161 and 0184).
Simitsis is also deemed to set the read chunk size, 0026, chunk = 10 gigabyte rows

Regarding claim 7, the combination as applied is deemed to render obvious, wherein each respective datastore object of the plurality of datastore objects further includes one or more of: 
O	a reset method (see Crupi, 0547-0551, with RESET values), for resetting, a state of the respective 
and
a preview method for reading a preview subset (see Chunk or chunking and/or a Partition), of the input data of the respective dataset

SEE Crupi, which appears to perform a preview read to validate by successful read; if the read is not successful, the records are discarded (0103).
Also see Simitsis, also reads in Chunks (see above)
SEE 0026, chunk = 10 gigabyte rows

a data method for, determining whether all of the input data of the respective dataset has been read

SEE Crupi 0103: “read each record, see if it validates” meets determining whether all of the input data of the respective dataset has been read (successfully); if a read is not successful, the records are discarded
or,
a write data method to: receive an additional data and add the additional data to the input data of the respective dataset
SEE Simitsis (Fig. 2, 202)
REAL-TIME DATA (See Title).
See Crupi based on Fig. 3, 0080, Temporal Queries and 0283, and being Continuous, and 0013, 0015, 0023-, 0049, 0106 and Fig. 2, 211, also, see RAQL, at 0120 (Continuous Query), processing continuous stream data (0152, 0160, 0165, 0175)

Regarding claim 8, the combination as applied is deemed to render obvious, wherein the input data has a type of a plurality of data types (or Formats), and a type of the respective datastore object is determined based on the type of data type for the respective dataset
SEE Crupi: the input data comprises plural types (from different sources), wherein the type is deemed determined based on the data type for the respective datasets and input data types (0050), including a hierarchical type such as XML (which is semi-structured), 0059, which can be hierarchical data (type) instead of just tabular (type) data as is handled by SQL.
SEE Crupi, which comprises types including stock market with symbol (0004, 0125, and 0147), spark plug types, and analysis of another type, 0158.

SEE Simitsis 0023, “Each node in this first level (i.e., nodes 404 and 406) represents a candidate execution environment for storing the type of data corresponding to the first level.”

Regarding claim 9, the combination as applied is deemed to render obvious, wherein the plurality of data types include one or more of: 
a tabular text file (Crupi, 0050, 0059 or non-tabular 0065 and 0162, 0185), 
an SQL file (Crupi 0049, 0058-) and Simitsis (Fig. 2, SQL and Map Reduce), 
an image file 
or
o a key-value pair formatted file (see Tuples 0082)

SEE Crupi: tuples include at least key-value pairs, with paired data including time stamp, name, and hash map inside (paired with) the tuples, converted from input formatted data (0151, 0112, JDBC or XML or JSON) to the tuple format.

[0082] A tuple can be, as an example without being limited to, Java, a normalized Java object (which is the data model for the tuple is: time stamp, name value (for example, a hash map).

Regarding claim 10, the combination as applied is deemed to render obvious, wherein the input data of the respective dataset includes, a plurality of files
SEE Simitsis (0008, such as: Log Files)
SEE Crupi files include static files 0123, 0139, 0271, 0325

Regarding claim 11, the combination as applied is deemed to render obvious, wherein the information identifying the location of the input data of the respective dataset includes an address of the plurality of files
SEE ADDRESS, including Network Addresses, 
Crupi 0177 (Link), 0178 (IP), and 0248 (URL) are network addresses. Crupi also includes locations or addresses based on time that addresses the data, wherein queries can be run on a temporal basis using the time data as address data, defining address locations of data (0067) by time as well as by network address, to access the sources through a query.
SEE 0049, 0067, Note, time data is stored in the memory, it is time stamped. Every record and every chunk of record can be time stamped. 
SEE Simitsis, accessed remote sources in view of Cloud (see above), with at least one, network address, as is conventional.

splitting the input data (by reading, in Chunks), of the respective dataset into a plurality of split sections (Chunking), wherein the respective datastore object distributes the plurality of split sections among a plurality of multi-processing nodes.
SEE Simitsis (see 0022 {w/servers}, 0175, Parallel), as applied (0023 and Fig. 4), which shows data source nodes (having been split, 404, 406) corresponding to separate operations with separate paths having nodes (408-430). In view of Fig. 2, note that the data sources per operation are separate (1 and 4), with operation 2 connected to 1 and 3-4 on a path; each processing section is separate or split.
Also see Crupi (see Parallel, 0003 and Presto), which, as well as Simitsis, reads in chunks (see above), wherein Crupi 0301 does partitioning (0103, 0118, 0121, 0161) into chunks, or chunking, as in a preset fixed size, as well as parallel processing.
	
Regarding claim 14, the combination as applied is deemed to render obvious, wherein at least one datastore object is used as an input (Simitsis, Fig. 2), to a MapReduce interface (see Map 
SEE Simitsis, Figs. 2, 3 (Map Reduce), 4

Regarding claim 15, the combination as applied is deemed to render obvious, wherein at least one datastore object is generated through a MapReduce system.
SEE Crupi 0128-, 0533-, “API uses a map/reduce paradigm”
And
Simitsis, Map Reduce is an execution environment (SEE abstract and at least Fig. 5)

Regarding claim 16, the combination as applied is deemed to render obvious, the size of the chunk is set based on a type of the input data of the respective dataset 
SEE Crupi, Stream of Row Data (or data Type) AND a maximum number of rows, “defined as the partition size”
SEE 507-
Description of Disclosure - DETX (596):
Datasets are streamed in sets of rows, with the maximum number of rows defined as the partition size. For more information on streaming partitions, see Stream Partitions. You set the partition size in when you store the dataset in the In-Memory Store. If no partition size is set, RAQL uses a default partition size of 10,000 rows. The following example sets the partition size for this dataset to 20,000 rows:

SEE 0600-, 0103, 0161, 0184, 0593 (FOR A STREAM, OR A TYPE)


Claims 18-24 (medium and system) are deemed analyzed and discussed with respect to the method claims above.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 



Contact Information
Any inquiry concerning this communication or earlier communications should be directed to the examiner of record Vincent F. Boccio whose telephone number is (571) 272-7373.

The examiner can normally be reached Monday-Thursday between 8:30 AM and 5:00 PM.

The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Boris Gorney can be reached on (571) 270-5626.



For more information about the PAIR system: "http://portal.uspto.gov/external/portal/pair"

Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC)
866-217-9197 (toll-free)

If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/VINCENT F BOCCIO/     Primary Examiner, Art Unit 2158