DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 15-19 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for pre-AIA applications, the applicant) regards as the invention.
Claims 15-19 each recite “the data system” or “the data analytics system” in line 1. There is insufficient antecedent basis for this limitation in the claims. That is, each claim only recites “a data analytics method” prior to this portion of the limitation.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 6-11, 13, 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Allan (US 2018/0052898) in view of Syed (US 2016/0205007) and further in view of Vasireddy (US 2020/0334270).
Regarding claim 1, Allan discloses:
A data analytics system comprising: at least one processor; and at least one non-transitory computer-readable medium containing instructions that, when executed by the at least one processor cause the data system to perform operations comprising: receiving, at a first storage location, input data at least by ([0130] “In accordance with an embodiment, each input HUB, e.g., HUB 111, can include a plurality of (source) datasets or entities 192” [0131] “In accordance with an embodiment, examples of input HUBs can include a database management system (DB, DBMS) 112 (e.g., an on-line transaction processing system (OLTP), business intelligence system, or an on-line analytical processing system (OLAP)).” [0133] “input HUBs can include a data source into which data is received from, e.g., an Oracle Big Data Prep (BDP) service”);
configuring a flow service to execute a flow, flow execution comprising: creating a pipeline using the flow and metadata associated with the flow at least by ([0069] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data, referred to herein in some embodiments as HUBs. The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs” [0072] “the system can perform an ontology analysis of a schema definition, to determine the types of data, and datasets or entities, associated with that schema; and generate, or update, a model from a reference schema that includes an ontology defined based on relationships between datasets or entities, and their attributes. A reference HUB including one or more schemas can be used to analyze data flows, and further classify or make recommendations such as, for example, transformations enrichments, filtering, or cross-entity data fusion of an input data” [0301] “Metadata and Data-Driven Auto-Mapping” [0189] “In accordance with an embodiment, during the ingest step, in order to access and ingest the SFDC content, a HUB is created in the data lake to receive this content. This can be performed, for example, by selecting an SFDC adapter for the relevant access mode (JDBC, REST, SOAP), creating the HUB, providing a name, and defining an ingest policy which could be time based or as needed by the related data flows” [0302] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data (referred to herein in some embodiments as HUBs). The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity, or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs” [0322] “the auto-map service can be triggered by, for example, receiving a HTTP POST request from the system facade service. The system facade API passes the dataflow application, e.g., pipelines, Lambda application JSON file from UI to the auto-map REST API, and the parser module processes the application JSON file and extracts entity name and shape of dataset or entity including attribute names and data type.”),
the pipeline configured to perform a data transformation specified in the flow at least by ([0149] “In accordance with an embodiment, in the run-time, or operation mode, the policy and flow definitions created by the user are applied and/or executed. For example, such processing can include invoking ingest, transform, model and publish services, to process data in the pipeline” [0180] “As illustrated in FIG. 3, in accordance with an embodiment, the processing of a DFML data flow 260 can include a plurality of steps, including an ingest step 262, during which data is ingested from various sources, for example, Salesforce (SFDC), S3, or DBaaS.” [0181] “During a data preparation step 264, the ingested data can be prepared, for example by being de-duplicated, standardized, or enriched” [0182] “During a transform step 266, the system can perform one or more merges, filters, or lookups at datasets, to transform the data.” [0192] “In accordance with an embodiment, the next step is to define how the separate sources can be joined together around a central item, which is typically the basis (fact) for the analysis, and which can be accomplished by defining a dataflow pipeline. This can be done directly by creating a pipeline domain-specific language (DSL) script, or by using the guided editor where the user can see the effect on the data at each step and can take advantage of the recommendation service that suggests how the data could be, e.g., corrected, enriched, joined.” [0268] “As illustrated in FIG. 21, in accordance with an embodiment, a pipeline compiler 582 operates between design 570 and execution 580 environments, including accepting one or more pipeline metadata 572, and a DSL, e.g., Java DSL 574, JSON DSL 576, Scala DSL 578, and providing an output for use with the execution environment e.g., as a Spark application 584 and/or SQL statements 586” [0376] “In accordance with an embodiment, the system can provide a service to recommend actions and transformations, on an input data, based on patterns identified from the functional decomposition of a data flow for a software application, including determining possible transformations of the data flow in subsequent applications. Data flows can be decomposed into a model describing transformations of data, predicates, and business rules applied to the data, and attributes used in the data flows” [0456] “In accordance with an embodiment, subsequent data flows can be processed using the metadata in the system HUB after it is updated. Metadata analysis can be performed on a data flow of a dataflow application, e.g., pipeline, Lambda application”);
generating, using the pipeline, output data from the input data at least by ([0085] “Pipeline: In accordance with an embodiment, a declarative means of defining a processing pipeline, having a plurality of stages or semantic actions, each of which corresponds to a function such as, for example, one or more of filtering, joining, enriching, transforming, or fusion of an input data, for preparation as an output data.”).
Allan fails to disclose “determining a tenancy associated with the input data using the flow; and storing, using the pipeline, the output data in a second storage location associated with the tenancy”
However, Syed teaches the following limitations, determining a tenancy associated with the input data using the flow at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance”) and the extracting of identifying characteristics from the input data is the identifying of the identity of the tenant which is transmitted along with the data stream itself;
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Syed into the teaching of Allan because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Allan to further include the identifying of tenants using an identity service as in Syed which “improves the ability for systems to bill specific tenants with a greater accuracy, and/or target the most resource-intensive tenants for throttling” (Syed, [0040]).
Allan and Syed fail to disclose “and storing, using the pipeline, the output data in a second storage location associated with the tenancy”.
However, Vasireddy teaches the above limitation at least by ([0108] “In accordance with an embodiment, the data pipeline or process maintains, for each of a plurality of customers (tenants), for example customer A 180, customer B 182, an analytic applications schema that is updated on a periodic or other basis, by the system in accordance with best practices for a particular analytics use case.” [0109] “For each of a plurality of customers (e.g., customers A, B), the system uses the analytic applications schema 162A, 162B, that is maintained and updated by the system, to pre-populate a data warehouse instance for the customer, based on an analysis of the data within that customer's enterprise applications environment 106A, 106B, and within each customer's tenancy (e.g., customer A tenancy 181, customer B tenancy 183); so that data is retrieved, by the data pipeline or process, from the customer's environment, and loaded to the customer's data warehouse instance 160A, 160B.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Vasireddy into the teaching of Allan and Syed because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to include the loading of data into tenant-specific databases while allowing a shared ETL server to orchestrate the ETL, as in Vasireddy, which allows for “better optimization of resources” and “easier updates and maintenance of the ETL server/agents/repository as patching one instance of the ETL resources provides the patch to multiple tenants” (Vasireddy, [0151]).
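For illustration only, the sequence of operations recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or from Allan, Syed, or Vasireddy: the names Flow, create_pipeline, and execute_flow, the in-memory storage structures, and the use of a record field to carry the tenancy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical in-memory stand-ins for the first storage location and the
# per-tenancy second storage locations (assumptions, not from any reference).
FIRST_STORAGE: List[dict] = []
TENANT_STORAGE: Dict[str, List[dict]] = {}

# Hypothetical registry of transformations a flow may name.
TRANSFORMS: Dict[str, Callable[[dict], dict]] = {
    "uppercase_name": lambda rec: {**rec, "name": rec["name"].upper()},
}


@dataclass
class Flow:
    transform: str      # name of the data transformation the flow specifies
    tenancy_field: str  # record field the flow uses to determine the tenancy


def create_pipeline(flow: Flow, metadata: dict) -> Callable[[dict], dict]:
    """Build a pipeline from the flow and the metadata associated with it."""
    transform = TRANSFORMS[flow.transform]
    required = metadata["schema"]

    def pipeline(record: dict) -> dict:
        # The metadata (here, a list of required fields) gates the transformation.
        missing = [key for key in required if key not in record]
        if missing:
            raise ValueError(f"record is missing fields: {missing}")
        return transform(record)

    return pipeline


def execute_flow(flow: Flow, metadata: dict, record: dict) -> dict:
    FIRST_STORAGE.append(record)                  # receive input data
    pipeline = create_pipeline(flow, metadata)    # create the pipeline
    tenancy = record[flow.tenancy_field]          # determine tenancy using the flow
    output = pipeline(record)                     # generate output data
    TENANT_STORAGE.setdefault(tenancy, []).append(output)  # store per tenancy
    return output
```

In this sketch, calling execute_flow with a record tagged for one tenant leaves the transformed record only in that tenant's list, which is the separation the “second storage location associated with the tenancy” limitation is directed to.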
As per claim 2, claim 1 is incorporated, Allan fails to disclose “wherein determining the tenancy comprises: extracting identifying characteristics from the input data; providing the identifying characteristics to an identity service; and receiving an indication of the tenancy from the identity service”
However, Syed teaches the following limitations, wherein determining the tenancy comprises: extracting identifying characteristics from the input data at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance”) and the extracting of identifying characteristics from the input data is the identifying of the identity of the tenant which is transmitted along with the data stream itself;
providing the identifying characteristics to an identity service; and receiving an indication of the tenancy from the identity service at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance, and is saved in an appropriate database”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Syed into the teaching of Allan because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Allan to further include the identifying of tenants using an identity service as in Syed which “improves the ability for systems to bill specific tenants with a greater accuracy, and/or target the most resource-intensive tenants for throttling” (Syed, [0040]).
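For illustration only, the three steps recited in claim 2 (extracting identifying characteristics, providing them to an identity service, and receiving an indication of the tenancy) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Syed's tenant identity manager or the applicant's implementation: the class IdentityService, the functions shown, and the field names account_id and region are hypothetical.

```python
from typing import Dict, Optional


class IdentityService:
    """Hypothetical identity service mapping an account identifier to a tenancy."""

    def __init__(self, directory: Dict[str, str]) -> None:
        self._directory = directory

    def resolve(self, characteristics: Dict[str, str]) -> Optional[str]:
        # Returns the tenancy for the account, or None when it is unknown.
        return self._directory.get(characteristics.get("account_id", ""))


def extract_identifying_characteristics(record: Dict[str, str]) -> Dict[str, str]:
    # Assumption: the identifying characteristics are these two record fields.
    return {key: record[key] for key in ("account_id", "region") if key in record}


def determine_tenancy(record: Dict[str, str], service: IdentityService) -> str:
    characteristics = extract_identifying_characteristics(record)  # extract
    tenancy = service.resolve(characteristics)                     # provide / receive
    if tenancy is None:
        raise LookupError("no tenancy found for the supplied characteristics")
    return tenancy
```

The design choice in the sketch is that the pipeline never decides the tenancy itself; it only forwards characteristics and acts on the indication returned by the service.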
As per claim 6, claim 1 is incorporated, Allan further discloses:
wherein: configuring a flow service to execute a flow comprises obtaining metadata associated with the flow at least by ([0072] “the system can perform an ontology analysis of a schema definition, to determine the types of data, and datasets or entities, associated with that schema; and generate, or update, a model from a reference schema that includes an ontology defined based on relationships between datasets or entities, and their attributes. A reference HUB including one or more schemas can be used to analyze data flows, and further classify or make recommendations such as, for example, transformations enrichments, filtering, or cross-entity data fusion of an input data” [0301] “Metadata and Data-Driven Auto-Mapping “ [0302] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data (referred to herein in some embodiments as HUBs). The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity, or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs”);
and wherein the execution of the flow depends on the obtained metadata at least by ([0149] “In accordance with an embodiment, in the run-time, or operation mode, the policy and flow definitions created by the user are applied and/or executed. For example, such processing can include invoking ingest, transform, model and publish services, to process data in the pipeline” [0180] “As illustrated in FIG. 3, in accordance with an embodiment, the processing of a DFML data flow 260 can include a plurality of steps, including an ingest step 262, during which data is ingested from various sources, for example, Salesforce (SFDC), S3, or DBaaS.” [0181] “During a data preparation step 264, the ingested data can be prepared, for example by being de-duplicated, standardized, or enriched” [0182] “During a transform step 266, the system can perform one or more merges, filters, or lookups at datasets, to transform the data.” [0192] “In accordance with an embodiment, the next step is to define how the separate sources can be joined together around a central item, which is typically the basis (fact) for the analysis, and which can be accomplished by defining a dataflow pipeline. This can be done directly by creating a pipeline domain-specific language (DSL) script, or by using the guided editor where the user can see the effect on the data at each step and can take advantage of the recommendation service that suggests how the data could be, e.g., corrected, enriched, joined.” [0268] “As illustrated in FIG. 21, in accordance with an embodiment, a pipeline compiler 582 operates between design 570 and execution 580 environments, including accepting one or more pipeline metadata 572, and a DSL, e.g., Java DSL 574, JSON DSL 576, Scala DSL 578, and providing an output for use with the execution environment e.g., as a Spark application 584 and/or SQL statements 586” [0376] “In accordance with an embodiment, the system can provide a service to recommend actions and transformations, on an input data, based on patterns identified from the functional decomposition of a data flow for a software application, including determining possible transformations of the data flow in subsequent applications. Data flows can be decomposed into a model describing transformations of data, predicates, and business rules applied to the data, and attributes used in the data flows” [0456] “In accordance with an embodiment, subsequent data flows can be processed using the metadata in the system HUB after it is updated. Metadata analysis can be performed on a data flow of a dataflow application, e.g., pipeline, Lambda application”).
As per claim 7, claim 6 is incorporated, Allan further discloses:
wherein: the metadata specifies a schema of the input data or rules for associating semantics with the input data at least by ([0072] “the system can perform an ontology analysis of a schema definition, to determine the types of data, and datasets or entities, associated with that schema; and generate, or update, a model from a reference schema that includes an ontology defined based on relationships between datasets or entities, and their attributes. A reference HUB including one or more schemas can be used to analyze data flows, and further classify or make recommendations such as, for example, transformations enrichments, filtering, or cross-entity data fusion of an input data” [0301] “Metadata and Data-Driven Auto-Mapping” [0302] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data (referred to herein in some embodiments as HUBs). The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity, or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs”).
As per claim 8, claim 1 is incorporated, Allan further discloses:
wherein: configuring a flow service to execute a flow comprises obtaining an artifact implementing a data transformation; and executing the flow comprises executing the artifact at least by ([0071] “In accordance with an embodiment, the system can provide a service to recommend actions and transformations, on an input data, based on patterns identified from the functional decomposition of a data flow for a software application, including determining possible transformations of the data flow in subsequent applications. Data flows can be decomposed into a model describing transformations of data, predicates, and business rules applied to the data, and attributes used in the data flows”) and the artifact is the model describing transformations.
As per claim 9, claim 8 is incorporated, Allan further discloses:
wherein: the artifact comprises a script, executable binary, or module at least by ([0512] “In some embodiments, the present invention includes a computer program product which is a non-transitory computer readable storage medium (media) having instructions stored thereon/in which can be used to program a computer to perform any of the processes of the present invention”) and the model describing transformations is implemented using instructions (executable binary) stored within non-transitory computer readable medium.
As per claim 10, claim 1 is incorporated, Allan further discloses:
wherein: the flow comprises a JSON or YAML object at least by ([0069] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data, referred to herein in some embodiments as HUBs. The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs” [0189] “In accordance with an embodiment, during the ingest step, in order to access and ingest the SFDC content, a HUB is created in the data lake to receive this content. This can be performed, for example, by selecting an SFDC adapter for the relevant access mode (JDBC, REST, SOAP), creating the HUB, providing a name, and defining an ingest policy which could be time based or as needed by the related data flows” [0322] “the auto-map service can be triggered by, for example, receiving a HTTP POST request from the system facade service. The system facade API passes the dataflow application, e.g., pipelines, Lambda application JSON file from UI to the auto-map REST API, and the parser module processes the application JSON file and extracts entity name and shape of dataset or entity including attribute names and data type.”).
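For illustration only, a flow expressed as a JSON object, as recited in claim 10, might be loaded and checked as sketched below. This is a minimal sketch under stated assumptions: the keys name, transform, and metadata, the sample values, and the function load_flow are hypothetical and do not appear in the application or in Allan's description of its application JSON file.

```python
import json

# Hypothetical flow definition; the keys and values are illustrative assumptions.
FLOW_JSON = """
{
  "name": "example-flow",
  "transform": "deduplicate",
  "metadata": {"schema": ["id", "name"]}
}
"""


def load_flow(text: str) -> dict:
    """Parse a flow definition and confirm it is a JSON object with the expected keys."""
    flow = json.loads(text)
    if not isinstance(flow, dict):
        raise ValueError("flow must be a JSON object")
    for key in ("name", "transform"):
        if key not in flow:
            raise ValueError(f"flow is missing required key: {key}")
    return flow
```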
As per claim 11, claim 1 is incorporated, Allan further discloses:
wherein: the flow specifies that the output data can be accessed using at least one of GraphQL, SOAP, Odata, or OpenAPI at least by ([0069] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data, referred to herein in some embodiments as HUBs. The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs” [0189] “In accordance with an embodiment, during the ingest step, in order to access and ingest the SFDC content, a HUB is created in the data lake to receive this content. This can be performed, for example, by selecting an SFDC adapter for the relevant access mode (JDBC, REST, SOAP), creating the HUB, providing a name, and defining an ingest policy which could be time based or as needed by the related data flows” [0322] “the auto-map service can be triggered by, for example, receiving a HTTP POST request from the system facade service. The system facade API passes the dataflow application, e.g., pipelines, Lambda application JSON file from UI to the auto-map REST API, and the parser module processes the application JSON file and extracts entity
Regarding claim 13, Allan discloses:
A data analytics method comprising: receiving, at a first storage location, input data; configuring a flow service to execute a flow, flow execution comprising: obtaining metadata associated with the flow, the metadata specifying a schema of the input data or rules for associating semantics with the input data at least by ([0130] “In accordance with an embodiment, each input HUB, e.g., HUB 111, can include a plurality of (source) datasets or entities 192” [0131] “In accordance with an embodiment, examples of input HUBs can include a database management system (DB, DBMS) 112 (e.g., an on-line transaction processing system (OLTP), business intelligence system, or an on-line analytical processing system (OLAP)).” [0133] “input HUBs can include a data source into which data is received from, e.g., an Oracle Big Data Prep (BDP) service”);
creating a pipeline using the flow and the metadata associated with the flow, the pipeline configured to perform a data transformation specified in the flow and execution of the flow dependent on the obtained metadata at least by ([0149] “In accordance with an embodiment, in the run-time, or operation mode, the policy and flow definitions created by the user are applied and/or executed. For example, such processing can include invoking ingest, transform, model and publish services, to process data in the pipeline” [0180] “As illustrated in FIG. 3, in accordance with an embodiment, the processing of a DFML data flow 260 can include a plurality of steps, including an ingest step 262, during which data is ingested from various sources, for example, Salesforce (SFDC), S3, or DBaaS.” [0181] “During a data preparation step 264, the ingested data can be prepared, for example by being de-duplicated, standardized, or enriched” [0182] “During a transform step 266, the system can perform one or more merges, filters, or lookups at datasets, to transform the data.” [0192] “In accordance with an embodiment, the next step is to define how the separate sources can be joined together around a central item, which is typically the basis (fact) for the analysis, and which can be accomplished by defining a dataflow pipeline. This can be done directly by creating a pipeline domain-specific language (DSL) script, or by using the guided editor where the user can see the effect on the data at each step and can take advantage of the recommendation service that suggests how the data could be, e.g., corrected, enriched, joined.” [0268] “As illustrated in FIG. 21, in accordance with an embodiment, a pipeline compiler 582 operates between design 570 and execution 580 environments, including accepting one or more pipeline metadata 572, and a DSL, e.g., Java DSL 574, JSON DSL 576, Scala DSL 578, and providing an output for use with the execution environment e.g., as a Spark application 584 and/or SQL statements 586” [0376] “In accordance with an embodiment, the system can provide a service to recommend actions and transformations, on an input data, based on patterns identified from the functional decomposition of a data flow for a software application, including determining possible transformations of the data flow in subsequent applications. Data flows can be decomposed into a model describing transformations of data, predicates, and business rules applied to the data, and attributes used in the data flows” [0456] “In accordance with an embodiment, subsequent data flows can be processed using the metadata in the system HUB after it is updated. Metadata analysis can be performed on a data flow of a dataflow application, e.g., pipeline, Lambda application”).
Allan fails to disclose “determining a tenancy associated with the input data using the flow, determining comprising: extracting identifying characteristics from the input data; providing the identifying characteristics to an identity service; and receiving an indication of the tenancy from the identity service; generating, using the pipeline, output data from the input data; and storing, using the pipeline, the output data in a second storage location associated with the tenancy”
However, Syed teaches the following limitations, determining a tenancy associated with the input data using the flow, determining comprising: extracting identifying characteristics from the input data at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance”) and the extracting of identifying characteristics from the input data is the identifying of the identity of the tenant which is transmitted along with the data stream itself;
providing the identifying characteristics to an identity service; and receiving an indication of the tenancy from the identity service at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance, and is saved in an appropriate database”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Syed into the teaching of Allan because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Allan to further include the identifying of tenants using an identity service as in Syed, which “improves the ability for systems to bill specific tenants with a greater accuracy, and/or target the most resource-intensive tenants for throttling” (Syed, [0040]).
Allan, Syed fail to disclose “and storing, using the pipeline, the output data in a second storage location associated with the tenancy”
However, Vasireddy teaches the above limitation at least by ([0108] “In accordance with an embodiment, the data pipeline or process maintains, for each of a plurality of customers (tenants), for example customer A 180, customer B 182, an analytic applications schema that is updated on a periodic or other basis, by the system in accordance with best practices for a particular analytics use case.” [0109] “For each of a plurality of customers (e.g., customers A, B), the system uses the analytic applications schema 162A, 162B, that is maintained and updated by the system, to pre-populate a data warehouse instance for the customer, based on an analysis of the data within that customer's enterprise applications environment 106A, 106B, and within each customer's tenancy (e.g., customer A tenancy 181, customer B tenancy 183); so that data is retrieved, by the data pipeline or process, from the customer's environment, and loaded to the customer's data warehouse instance 160A, 160B.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Vasireddy into the teaching of Allan and Syed because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the loading of data into tenant-specific databases while allowing a shared ETL server to orchestrate the ETL, as in Vasireddy, which allows for “better optimization of resources” and “easier updates and maintenance of the ETL server/agents/repository as patching one instance of the ETL resources provides the patch to multiple tenants” (Vasireddy, [0151]).
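For illustration only, the sequence of limitations addressed above (extracting identifying characteristics, providing them to an identity service, receiving an indication of the tenancy, and storing the output data in a location associated with that tenancy) can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from Allan, Syed, or Vasireddy: the names `IdentityService`, `TenantStore`, `extract_characteristics`, and `run_flow`, the `domain` field, and the upper-casing transformation are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IdentityService:
    """Hypothetical identity service: maps identifying characteristics to a tenancy."""
    directory: dict  # assumed lookup table, e.g. {"acme.example": "tenant-a"}

    def resolve(self, characteristics: dict) -> str:
        # Assumption: the tenancy is keyed by a "domain" characteristic.
        return self.directory[characteristics["domain"]]


@dataclass
class TenantStore:
    """Hypothetical second storage location, partitioned by tenancy."""
    locations: dict = field(default_factory=dict)

    def write(self, tenancy: str, records: list) -> None:
        self.locations.setdefault(tenancy, []).extend(records)


def extract_characteristics(input_data: list) -> dict:
    # Assumption: every record in the batch carries the same "domain" value.
    return {"domain": input_data[0]["domain"]}


def run_flow(input_data: list, identity_service: IdentityService, store: TenantStore) -> str:
    # 1. Extract identifying characteristics from the input data.
    characteristics = extract_characteristics(input_data)
    # 2. Provide them to the identity service and receive the tenancy.
    tenancy = identity_service.resolve(characteristics)
    # 3. Generate output data (a stand-in transformation for this sketch).
    output = [{"name": record["name"].upper()} for record in input_data]
    # 4. Store the output in the location associated with the tenancy.
    store.write(tenancy, output)
    return tenancy
```

In this sketch the tenancy is resolved once per batch; a per-record lookup would follow the same pattern.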
Regarding claim 20, Allan discloses:
A data analytics method comprising: receiving, at a first storage location, input data at least by ([0130] “In accordance with an embodiment, each input HUB, e.g., HUB 111, can include a plurality of (source) datasets or entities 192” [0131] “In accordance with an embodiment, examples of input HUBs can include a database management system (DB, DBMS) 112 (e.g., an on-line transaction processing system (OLTP), business intelligence system, or an on-line analytical processing system (OLAP)).” [0133] “input HUBs can include a data source into which data is received from, e.g., an Oracle Big Data Prep (BDP) service”);
configuring a flow service to execute a flow, the flow comprises a JSON or YAML object and specifies that output data generated by the flow can be accessed using at least one of GraphQL, SOAP, Odata, or OpenAPI at least by ([0069] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data, referred to herein in some embodiments as HUBs. The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs” [0189] “In accordance with an embodiment, during the ingest step, in order to access and ingest the SFDC content, a HUB is created in the data lake to receive this content. This can be performed, for example, by selecting an SFDC adapter for the relevant access mode (JDBC, REST, SOAP), creating the HUB, providing a name, and defining an ingest policy which could be time based or as needed by the related data flows” [0322] “the auto-map service can be triggered by, for example, receiving a HTTP POST request from the system facade service. The system facade API passes the dataflow application, e.g., pipelines, Lambda application JSON file from UI to the auto-map REST API, and the parser module processes the application JSON file and extracts entity name and shape of dataset or entity including attribute names and data type.”),
flow execution comprising: obtaining metadata associated with the flow, the metadata specifying a schema of the input data or rules for associating semantics with the input data at least by ([0072] “the system can perform an ontology analysis of a schema definition, to determine the types of data, and datasets or entities, associated with that schema; and generate, or update, a model from a reference schema that includes an ontology defined based on relationships between datasets or entities, and their attributes. A reference HUB including one or more schemas can be used to analyze data flows, and further classify or make recommendations such as, for example, transformations enrichments, filtering, or cross-entity data fusion of an input data” [0301] “Metadata and Data-Driven Auto-Mapping “ [0302] “In accordance with an embodiment, the system can provide support for auto-mapping of complex data structures, datasets or entities, between one or more sources or targets of data (referred to herein in some embodiments as HUBs). The auto-mapping can be driven by a metadata, schema, and statistical profiling of a dataset; and used to map a source dataset or entity associated with an input HUB, to a target dataset or entity, or vice versa, to produce an output data prepared in a format or organization (projection) for use with one or more output HUBs”);
creating a pipeline using the flow and the metadata associated with the flow, the pipeline configured to perform a data transformation specified in the flow and execution of the flow dependent on the obtained metadata at least by ([0149] “In accordance with an embodiment, in the run-time, or operation mode, the policy and flow definitions created by the user are applied and/or executed. For example, such processing can include invoking ingest, transform, model and publish services, to process data in the pipeline” [0180] “As illustrated in FIG. 3, in accordance with an embodiment, the processing of a DFML data flow 260 can include a plurality of steps, including an ingest step 262, during which data is ingested from various sources, for example, Salesforce (SFDC), S3, or DBaaS.” [0181] “During a data preparation step 264, the ingested data can be prepared, for example by being de-duplicated, standardized, or enriched” [0182] “During a transform step 266, the system can perform one or more merges, filters, or lookups at datasets, to transform the data.” [0192] “In accordance with an embodiment, the next step is to define how the separate sources can be joined together around a central item, which is typically the basis (fact) for the analysis, and which can be accomplished by defining a dataflow pipeline. This can be done directly by creating a pipeline domain-specific language (DSL) script, or by using the guided editor where the user can see the effect on the data at each step and can take advantage of the recommendation service that suggests how the data could be, e.g., corrected, enriched, joined.” [0268] “As illustrated in FIG. 21, in accordance with an embodiment, a pipeline compiler 582 operates between design 570 and execution 580 environments, including accepting one or more pipeline metadata 572, and a DSL, e.g., Java DSL 574, JSON DSL 576, Scala DSL 578, and providing an output for use with the execution environment e.g., as a Spark application 584 and/or SQL statements 586” [0376] “In accordance with an embodiment, the system can provide a service to recommend actions and transformations, on an input data, based on patterns identified from the functional decomposition of a data flow for a software application, including determining possible transformations of the data flow in subsequent applications. Data flows can be decomposed into a model describing transformations of data, predicates, and business rules applied to the data, and attributes used in the data flows” [0456] “In accordance with an embodiment, subsequent data flows can be processed using the metadata in the system HUB after it is updated. Metadata analysis can be performed on a data flow of a dataflow application, e.g., pipeline, Lambda application”);
generating, using the pipeline, the output data from the input data at least by ([0085] “Pipeline: In accordance with an embodiment, a declarative means of defining a processing pipeline, having a plurality of stages or semantic actions, each of which corresponds to a function such as, for example, one or more of filtering, joining, enriching, transforming, or fusion of an input data, for preparation as an output data.”);
Allan fails to disclose “determining a tenancy associated with the input data using the flow, determining comprising: extracting identifying characteristics from the input data; providing the identifying characteristics to an identity service; and receiving an indication of the tenancy from the identity service; and storing, using the pipeline, the output data in a second storage location associated with the tenancy”
However, Syed teaches the following limitations, determining a tenancy associated with the input data using the flow, determining comprising: extracting identifying characteristics from the input data at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance”) and the extracting of identifying characteristics from the input data is the identifying of the identity of the tenant which is transmitted along with the data stream itself;
providing the identifying characteristics to an identity service; and receiving an indication of the tenancy from the identity service at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance, and is saved in an appropriate database”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Syed into the teaching of Allan because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Allan to further include the identifying of tenants using an identity service as in Syed, which “improves the ability for systems to bill specific tenants with a greater accuracy, and/or target the most resource-intensive tenants for throttling” (Syed, [0040]).
Allan, Syed fail to disclose “and storing, using the pipeline, the output data in a second storage location associated with the tenancy”
However, Vasireddy teaches the above limitation at least by ([0108] “In accordance with an embodiment, the data pipeline or process maintains, for each of a plurality of customers (tenants), for example customer A 180, customer B 182, an analytic applications schema that is updated on a periodic or other basis, by the system in accordance with best practices for a particular analytics use case.” [0109] “For each of a plurality of customers (e.g., customers A, B), the system uses the analytic applications schema 162A, 162B, that is maintained and updated by the system, to pre-populate a data warehouse instance for the customer, based on an analysis of the data within that customer's enterprise applications environment 106A, 106B, and within each customer's tenancy (e.g., customer A tenancy 181, customer B tenancy 183); so that data is retrieved, by the data pipeline or process, from the customer's environment, and loaded to the customer's data warehouse instance 160A, 160B.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Vasireddy into the teaching of Allan and Syed because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the loading of data into tenant-specific databases while allowing a shared ETL server to orchestrate the ETL, as in Vasireddy, which allows for “better optimization of resources” and “easier updates and maintenance of the ETL server/agents/repository as patching one instance of the ETL resources provides the patch to multiple tenants” (Vasireddy, [0151]).
Claims 16, 17, 18, 19 recite similar claim limitations as the method of claims 8, 10, 11, 12, except that they set forth the claimed invention as a system; as such, they are rejected for the same reasons as applied hereinabove.

Claims 3, 14 are rejected under 35 U.S.C. 103 as being unpatentable over Allan (US 2018/0052898) in view of Syed (US 2016/0205007) and Vasireddy (US 2020/0334270) and further in view of Mehta (US 2013/0054648).
As per claim 3, claim 2 is incorporated, Allan, Syed, Vasireddy fail to disclose “wherein the operations further comprise: associating a tenancy object representing the output data with a parent object in a hierarchical data object ownership graph; and determining whether to authorize a request to display at least a portion of the output data based at least in part on the association between the tenancy object and the parent object”
However, Mehta teaches the following limitations, wherein the operations further comprise: associating a tenancy object representing the output data with a parent object in a hierarchical data object ownership graph at least by ([0027] “The object store 30 has multiple example fields. A “tenant ID” field indicates the organization in a multi-tenant database implementation, and can be removed in a single tenant database implementation. An “object ID” field can indicate a key that identifies a database object, uniquely or uniquely in combination with a tenant ID” [0066] “Step 86 determines if the user has permission to access the database object via a role hierarchy relationship between the user and the owner of the database object. A role hierarchy can be a treelike structure that indicates whether each entity in the role hierarchy has a supervisory relationship or a subordinate relationship with another entity in the role hierarchy.”) and the tree-structured role hierarchy is the hierarchical data object ownership graph;
and determining whether to authorize a request to display at least a portion of the output data based at least in part on the association between the tenancy object and the parent object at least by ([0066] “Step 86 determines if the user has permission to access the database object via a role hierarchy relationship between the user and the owner of the database object. A role hierarchy can be a treelike structure that indicates whether each entity in the role hierarchy has a supervisory relationship or a subordinate relationship with another entity in the role hierarchy. Users at any given role level can view, edit, and report on all data owned by or shared with users below them in the hierarchy, unless an organization's sharing model for an object specifies otherwise. In one implementation, an outer join is performed between the Activity objects and a flattened role hierarchy from the point of view of the user.” [0112] “Whether the user has permission to access a particular parent object of the data object, is based on the user being an owner of the particular parent object.” [0113] “A parent-to-child relationship between a particular parent object and the database object is characterized by a direction of inheritance of permission to access the database object from the particular parent object to the database object.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Mehta into the teaching of Allan, Syed, Vasireddy because the references similarly disclose database storage and processing. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include access control comprising accessing of objects by tenants based on hierarchical relationships as in Mehta which “defers the computational cost of the most difficult access paths until the exhaustion of alternatives to grant the user access”.
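For illustration only, the authorization limitation of claim 3 as mapped above (granting access to a tenancy object through its association with a parent object in an ownership hierarchy) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Mehta's implementation: the functions `ancestors` and `may_display`, the `parent_of` and `owner_of` mappings, and the rule that ownership of any ancestor grants access are all hypothetical simplifications.

```python
def ancestors(parent_of: dict, node: str):
    """Yield each ancestor of node by following parent links upward."""
    seen = set()
    current = parent_of.get(node)
    # The seen set guards against accidental cycles in the graph.
    while current is not None and current not in seen:
        seen.add(current)
        yield current
        current = parent_of.get(current)


def may_display(parent_of: dict, owner_of: dict, user: str, tenancy_object: str) -> bool:
    """Authorize if the user owns the tenancy object or any of its ancestors."""
    if owner_of.get(tenancy_object) == user:
        return True
    return any(owner_of.get(a) == user for a in ancestors(parent_of, tenancy_object))
```

Under these assumptions, a user who owns a parent object inherits permission to display output data associated with any descendant tenancy object, while an unrelated user is refused.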
Claim 14 recites similar claim limitations as the system of claim 3, except that it sets forth the claimed invention as a method; as such, it is rejected for the same reasons as applied hereinabove.

Claims 4-5, 15 are rejected under 35 U.S.C. 103 as being unpatentable over Allan (US 2018/0052898) in view of Syed (US 2016/0205007) and Vasireddy (US 2020/0334270) and further in view of Guerra (US 2014/0214753).
As per claim 4, claim 1 is incorporated, Allan fails to disclose “wherein: the tenancy is determined during execution of the pipeline… based on values of a specified column in the input data, the column specified in the flow”
However, Syed teaches wherein: the tenancy is determined during execution of the pipeline at least by ([0071] “The different stages in the data stream flow through stream filter 502, quantization engine 503, subscription engine 504 and billing interface 506 resolve the identity of tenant by using the services of tenant identity manager 901 and identify the application using application identity manager 1001.” [0079] “Each monitored data stream 801 is generally transmitted to metering system 1000 with an identity of the instance 1210, which is then identified by the zone identity manager 910. When the monitored data streams 801 arrive with application 1230 identity and tenant 1220 identity, to the metering system 1000, the metering is done for the appropriate tenant 1220 and the application 1230, as identified by the tenant identity manager 901 and application identity manager 1001. In other words, the data stream is labeled with the identity of the application invoked in the instance, and the identity of the tenant who invoked the application in the instance”) and the extracting of identifying characteristics from the input data is the identifying of the identity of the tenant which is transmitted along with the data stream itself.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Syed into the teaching of Allan because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in Allan to further include the identifying of tenants using an identity service as in Syed, which “improves the ability for systems to bill specific tenants with a greater accuracy, and/or target the most resource-intensive tenants for throttling” (Syed, [0040]).
Allan, Syed, Vasireddy fail to disclose “wherein: the tenancy is determined … based on values of a specified column in the input data, the column specified in the flow”
However, Guerra teaches the above limitation at least by ([0004] “The second stage ETL process can be associated with a single source, or a plurality of sources.” [0030] “Some of the embodiments disclosed here provide methods and systems to bring together the common elements between the various data sources and align them so that users can utilize one system for interacting with data from various data sources. For example, the methods and systems described in the present application can determine the table that contains customer information and the field that contains the customer number for each data source, and aligns them and stores them in a table that clearly identifies the customer table and the customer number field”) and the values of the specified column in the input data are the customer information and the field containing the customer number for each data source of data that is inputted for ETL processing. That is, the customer information or customer number is the tenant or tenant ID.
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Guerra into the teaching of Allan, Syed, Vasireddy because the references similarly disclose ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the determining of tenancies based on specified column values of customer data as in Guerra for “improving the speed and efficiency of data warehouse operations” (Guerra, [0050]).
As per claim 5, claim 4 is incorporated, Allan, Syed fail to disclose “the pipeline generates multiple output datasets, each data set corresponding to one of the multiple values; and the pipeline stores each of the multiple output datasets in locations corresponding to differing tenancies; records in the input dataset have different values of the specified column”
However, Vasireddy teaches the following limitations, the pipeline generates multiple output datasets, each data set corresponding to one of the multiple values; and the pipeline stores each of the multiple output datasets in locations corresponding to differing tenancies at least by ([0108] “In accordance with an embodiment, the data pipeline or process maintains, for each of a plurality of customers (tenants), for example customer A 180, customer B 182, an analytic applications schema that is updated on a periodic or other basis, by the system in accordance with best practices for a particular analytics use case.” [0109] “For each of a plurality of customers (e.g., customers A, B), the system uses the analytic applications schema 162A, 162B, that is maintained and updated by the system, to pre-populate a data warehouse instance for the customer, based on an analysis of the data within that customer's enterprise applications environment 106A, 106B, and within each customer's tenancy (e.g., customer A tenancy 181, customer B tenancy 183); so that data is retrieved, by the data pipeline or process, from the customer's environment, and loaded to the customer's data warehouse instance 160A, 160B.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Vasireddy into the teaching of Allan and Syed because the references similarly disclose pipelining and ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the loading of data into tenant-specific databases while allowing a shared ETL server to orchestrate the ETL, as in Vasireddy, which allows for “better optimization of resources” and “easier updates and maintenance of the ETL server/agents/repository as patching one instance of the ETL resources provides the patch to multiple tenants” (Vasireddy, [0151]).
Guerra further discloses:
records in the input dataset have different values of the specified column at least by ([0030] “Some of the embodiments disclosed here provide methods and systems to bring together the common elements between the various data sources and align them so that users can utilize one system for interacting with data from various data sources. For example, the methods and systems described in the present application can determine the table that contains customer information and the field that contains the customer number for each data source, and aligns them and stores them in a table that clearly identifies the customer table and the customer number field”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Guerra into the teaching of Allan, Syed, Vasireddy because the references similarly disclose ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the determining of tenancies based on specified column values of customer data as in Guerra for “improving the speed and efficiency of data warehouse operations” (Guerra, [0050]).
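For illustration only, the limitations of claims 4 and 5 as mapped above (determining tenancy from the values of a column specified in the flow, and generating one output dataset per distinct value) can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Guerra or Vasireddy: the function `split_by_tenancy_column`, the `tenancy_column` key of the flow, and the `customer_number` field are hypothetical.

```python
from collections import defaultdict


def split_by_tenancy_column(records: list, flow: dict) -> dict:
    """Group input records into one output dataset per value of the flow's tenancy column."""
    column = flow["tenancy_column"]  # assumed key naming the column in the flow
    outputs = defaultdict(list)
    for record in records:
        # Each distinct column value is treated as a distinct tenancy.
        outputs[record[column]].append(record)
    return dict(outputs)
```

Each key of the returned mapping would then select the storage location for the corresponding output dataset.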

Claim 15 recites similar claim limitations as the system of claims 4 & 5, except that it sets forth the claimed invention as a method; as such, it is rejected for the same reasons as applied hereinabove.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Allan (US 2018/0052898) in view of Syed (US 2016/0205007) and Vasireddy (US 2020/0334270) and further in view of Ito (US 2021/0227021).
As per claim 12, claim 1 is incorporated, Allan, Syed, Vasireddy fail to disclose “wherein: an append-only data store includes the first storage location; and a data-lake includes the second location”
However, Ito teaches the following limitations, wherein: an append-only data store includes the first storage location at least by ([0040] “A database 121-1 stores append-only resources 141. The append-only resources 141 here are resources that are managed in a way that stored data itself is not updated.”);
and a data-lake includes the second location at least by ([0044] “Each second system 102 has a data acquisition module 151 and a big data processing platform 152 and manages a data lake 155.”).
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to incorporate the teaching of Ito into the teaching of Allan, Syed, Vasireddy because the references similarly disclose ETL processes. Consequently, one of ordinary skill in the art would be motivated to further modify the system as in the combination of references to further include the append-only data store and data lake as in Ito in order to improve processing and storage efficiency.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM P BARTLETT whose telephone number is (469)295-9085.  The examiner can normally be reached on M-Th 11:30-8:30, F 11-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WILLIAM P BARTLETT/
Examiner, Art Unit 2169

/USMAAN SAEED/
Supervisory Patent Examiner, Art Unit 2169