DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4, 7-12, 15 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200050594 A1; Tidwell; Kenny et al. (hereinafter Tidwell) in view of Johnson; Theodore et al.; US 20180139118 A1 (hereinafter Johnson) and US 20210037112 A1; ANKIREDDYPALLE; Ramachandra Reddy et al. (hereinafter Ank).
Regarding claim 1, Tidwell teaches A computer-implemented method comprising: … a data lake storing a plurality of data lake records partitioned across a plurality of data lake partitions …a respective data lake partition identifier retrieving from a data lake update service a subset of the data lake partition identifiers (Tidwell [0036] Listener 120 is a component of the ECMS 102 that receives raw data streams and writes the raw data streams to a data lake 130. Listener 120 listens for raw data streams from many different data sources 105A-N. Listener 120 creates a separate raw data stream record in the data lake 130 for each data source, and writes the raw data stream from that data source 105A-N into the appropriate raw data stream record. Each raw data stream may be a constant or periodic stream of data. For example, some data streams may be sent once a day at a particular time. Other data streams may be sent as new data becomes available. Data streams may also be received at other regular or non-regular periodicity. [0054] Data stream writer 225 creates a new raw data stream record 245 in the data lake 130 to store the raw data stream 210 from the new data source. This may include issuing a command to data store interface 135 to cause the data store interface 135 to generate the raw data stream record 245 in data store 140. Data stream writer 225 includes the data source ID (and in some instances the determined source type) in the command, and the data source ID is included in a raw data stream record ID of the raw data stream record 245. In one embodiment, the raw data stream record ID for the raw data stream record includes the data source ID as a root and an identifier of the stream type. A raw data stream record may have the format "UUID-raw". For example, if the data source ID was "firewall2", then the raw data stream record ID may be "firewall2-raw". 
In some instances, the source type is also identified for the raw data stream record   [0065] Log separator 315 generates a corrected data stream 320 that includes the separated discrete log entries, and writes the corrected data stream 320 to a corrected data stream record 325 in the data lake 130. The corrected data stream record 325 in one embodiment contains the data source ID and a further identifier that indicates that the corrected data stream record contains discrete log entries. In one embodiment, the corrected data stream record 325 has a label of "UUID-single". For example, if the data source ID was "firewall2", then the ID for the corrected data stream record 325 may be "firewall2-single". [FIG.3] shows a visual of the system)						retrieving one or more of the data lake records via a communication interface using one or more queries that each include a respective one or more of the subset of the data lake partition identifiers; (Tidwell [0037] Data lake 130 is a large object-based data store 135 accompanied by a processing engine (data store interface 135) to operate on data in the data store 135. Data lake 130 may be capable of storing and operating on any type of data, regardless of a format of that data. Data lake 130 stores data such as raw data streams in a native format of the data. Examples of data lakes include Azure Data Lake.RTM., Kafka.RTM., Rabbit MQ.RTM., and Hadoop.RTM.. Data store interface 135 receives read and write requests, and performs reads to the data store 140 and writes from the data store 140 responsive to those read and write requests. For example, data store interface 135 may receive write requests from listener 120 to write messages containing log data of a raw data stream to a raw data stream record. Data store interface 135 may also respond to read and write requests from indexer 150.  [0059] Log separator 315 retrieves raw log data 305 from raw data stream records in the data lake 130. 
The raw log data 305 may be log data having an original format that the log data had when it was initially created, or close thereto. Alternatively, the raw log data may be log data that has been minimally modified (e.g., by tagging the log data with a source ID and/or a source type). The raw log data 305 may be retrieved by issuing read commands to data store interface 135 of the data lake 130. Responsive to receiving raw log data 305, log separator 315 determines whether the source type is known for the data source object associated with the raw log data 305. In one embodiment, log separator 315 determines the data source ID associated with the raw data stream record 245 that the raw log data 305 is retrieved from, and issues a query to the event data store 165 using the data source ID. The event data store 165 may then return the data source object 235 having the data source ID and/or may return specific information about the data source object 235 (e.g., a particular source type or an indication that the source type is unknown). Alternatively, the source type or unknown source type may be identified in metadata associated with the raw log data )												Tidwell lacks explicitly teaching a respective data lake partition being identified by a respective data lake partition identifier associated with a respective timestamp value identifying a respective time at which a record within the respective data lake partition was most recently updated; generating a transformed one or more records by applying a transformation function to the retrieved one or more records; and transmitting the transformed one or more records to a downstream data service, the transformed one or more records being associated with the designated period of time.					
a respective data lake partition being identified by a respective data lake partition identifier associated with a respective timestamp value identifying a respective time at which a record within the respective data lake partition was most recently updated; the data lake partition identifiers have been already been taught/established by Tidwell above, Ank is being used to teach the concept that the metadata can include timestamp information of the most recent update (Ank [0085] Metadata can include ... the last modified time (e.g., the time of the most recent modification of the data object), a data object name ... creation date,...aging information (e.g., a schedule, such as a time period, in which the data object is migrated to secondary or long term storage) [0230] time-related factors (e.g., aging information such as time since the creation or modification of a data object); The snapshot manager 330 can create, at (4), a directory in one or more of the low speed drives 320 corresponding to a timestamp of the snapshot that was just taken. The snapshot manager 330 can then use the information provided by the file scanner 340 to identify the changed files ... The identification of the snapshot can be a timestamp at which a particular snapshot was taken, a name of the snapshot, etc. Here, the stub creator 350 may create one or more stubs for file F1 and one or more stubs for file F2. 
Once created, the stub creator 350 can transmit the stubs to the snapshot manager 330 at [327] further elaborates on the fact that you can have metadata which can have file ID and corresponding timestamp of update/creation)										generating a transformed one or more records by applying a transformation function to the retrieved one or more records; and transmitting the transformed one or more records to a downstream data service, the transformed one or more records being associated with the designated period of time (Ank [0301] After another period of time (e.g., as defined by the storage policy), the media agent 144 can convert the file F1 from the native format into a secondary copy format. The media agent 144 may stored the converted file F1 in the same location on the low speed drive(s) 320 or on a different location on the low speed drive(s) 320. Thus, the file F1 can be stored in a secondary copy format on the low speed drive(s) 320 at (4). At some later time (e.g., as defined by the storage policy), the media agent 144 can move the file F1 in the secondary copy format to one or more of the secondary storage devices 108, which can be local to the system 300 or located remotely from the system 300 and accessible via a network. )								Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all Tidwell's methods and make the addition of Ank in order to help add a method of changing according to a designated period of time and ultimately create a more efficient system (Ank [AB.] An improved information management system that implements a staging area or cache to temporarily store primary data in a native format before the primary data is converted into secondary copies in a secondary format is described herein. For example, the improved information management system can include various media agents that each include one or more high speed drives. 
When a client computing device provides primary data for conversion into secondary copies, the primary data can initially be stored in the native format in the high speed drive(s). If the client computing device then submits a request for the primary data, the media agent can simply retrieve the primary data from the high speed drive(s) and transmit the primary data to the client computing device. Because the primary data is already in the native format, no conversion operations are performed by the media agent, thereby reducing the restore delay. [297] Thus, when a client computing device 110 requests one or more files that have been moved to the low speed drive(s) 320A-C, the media agent 144A-C can access the stub(s) corresponding to the requested file(s) stored in the high speed drive(s) 310A-C to determine the location of the requested file(s). Once the location is determined, the media agent 144 can retrieve the requested file(s) from the low speed drive(s) 320A-C and transmit the files to the client computing device 110. Because the files stored in the low speed drive(s) 320A-C are still in the native format at this stage, the media agent 144 may not need to perform any format conversions, thereby speeding up the file retrieval process.)		
The combination still does not explicitly teach, in the recited order, receiving at a computing device having a processor and memory a transform request identifying a temporal checkpoint associated with a data lake storing a plurality of data lake records partitioned across a plurality of data lake partitions each identified by a respective data lake partition identifier; each associated with a respective timestamp value later than the temporal checkpoint, each of the subset of the data lake partition identifiers associated with a respective data lake partition including a respective data lake record updated after the temporal checkpoint.
However, Johnson helps teach receiving at a computing device having a processor and memory a transform request identifying a temporal checkpoint associated with a data lake storing a plurality of data lake records partitioned across a plurality of data lake partitions each identified by a respective data lake partition identifier; (Johnson [0003] In one example, the present disclosure discloses a device, method and computer-readable medium for recovering a replica in an operator in a data streaming processing system. 
A method may obtain a checkpoint in an input data stream, determine a maximum-timestamp at the checkpoint in the input data stream, calculate a completeness point that is greater than the maximum-timestamp for an output data stream and process data records from the checkpoint onwards that have a respective timestamp that is greater than or equal to the completeness point that was calculated to generate a new replica to replace a failed replica.[0062] In one embodiment, the method may include obtaining a checkpoint in an input data stream, determining a maximum-timestamp at the checkpoint in the input data stream, calculating a completeness point that is greater than the maximum-timestamp for an output data stream and processing data records from the checkpoint onwards that have a respective timestamp that is greater than or equal to the completeness point  [0075-0079] further elaborate)											each associated with a respective timestamp value later than the temporal checkpoint, each of the subset of the data lake partition identifiers associated with a respective data lake timestamp value later than the temporal checkpoint (Johnson [0003] In one example, the present disclosure discloses a device, method and computer-readable medium for recovering a replica in an operator in a data streaming processing system. 
A method may obtain a checkpoint in an input data stream, determine a maximum-timestamp at the checkpoint in the input data stream, calculate a completeness point that is greater than the maximum-timestamp for an output data stream and process data records from the checkpoint onwards that have a respective timestamp that is greater than or equal to the completeness point that was calculated to generate a new replica to replace a failed replica.[0062] In one embodiment, the method may include obtaining a checkpoint in an input data stream, determining a maximum-timestamp at the checkpoint in the input data stream, calculating a completeness point that is greater than the maximum-timestamp for an output data stream and processing data records from the checkpoint onwards that have a respective timestamp that is greater than or equal to the completeness point  [0075-0079] further elaborate on the checkpoint/maximum threshold point related corresponding records)			Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all prior methods and make the addition of Johnson in order to create a more accurate and efficient system via specified ways of analyzing and processing records in memory (Johnson [0046] Thus, the present disclosure provides a data stream processing system that processes a stream of records with a guarantee that each record is accounted for exactly once using replication. The present system is able to provide clean semantics that allows for code generation from high-level languages and query system optimization. Furthermore, the present system provides elastic scaling while also allowing for a great deal of flexibility for new data sources to be added or snapped into a stream (if their schemas match), and new applications can be quickly added to an existing stream system by having the application subscribe to the proper stream message queues [0075-0081] further elaborate)
Corresponding system claim 12 is rejected similarly to claim 1 above. Additional limitations: device with processor(s) and memory (Tidwell [FIG. 13] shows a device with processor(s) and memory).
Corresponding product claim 18 is rejected similarly to claim 1 above. Additional limitations: computer readable medium capable of reading and executing instructions (Tidwell [FIG. 13] shows such a computer readable medium).
Regarding claim 4, the combination of Tidwell, Johnson and Ank teach The method recited in claim 1, wherein a designated one of the data lake partition identifiers is associated with a pointer to a file in the data lake. (Ank [0173] Some types of snapshots do not actually create another physical copy of all the data as it existed at the particular point in time, but may simply create pointers that map files and directories to specific memory locations (e.g., to specific disk blocks) where the data resides as it existed at the particular point in time. For example, a snapshot copy may include a set of pointers derived from the file system or from an application. In some other cases, the snapshot may be created at the block-level, such that creation of the snapshot occurs without awareness of the file system. Each pointer points to a respective stored data block, so that collectively, the set of pointers reflect the storage location and state of the data object (e.g., file(s) or volume(s) or data set(s)) at the point in time when the snapshot copy was created. [0174] An initial snapshot may use only a small amount of disk space needed to record a mapping or other data structure representing or otherwise tracking the blocks that correspond to the current state of the file system. Additional disk space is usually required only when files and directories change later on. Furthermore, when files change, typically only the pointers which map to blocks are copied, not the blocks themselves. For example for "copy-on-write" snapshots, when a block changes in primary storage, the block is copied to secondary storage or cached in primary storage before the block is overwritten in primary storage, and the pointer to that block is changed to reflect the new location of that block. The snapshot mapping of file system data may also be updated to reflect the changed block(s) at that particular point in time.)
Corresponding system claim 15 is rejected similarly to claim 4 above.
Regarding claim 7, the combination of Tidwell, Johnson and Ank teach The method recited in claim 1, wherein the data lake records are stored in one or more third-party cloud computing storage systems (Tidwell [0029] The various computing devices 115, 125, 145, 155, 170 may be connected via one or more networks, which may include a local area network (LAN), a wide area network (WAN) such as the Internet, and or a combination thereof. Additionally, computing devices 115 may be connected to one or more data sources 105A, 105B through 105N via one or more networks. Client computing devices 180 and/or third party computing devices 182 executing third party services 185 may be connected to computing devices 170 via one or more networks.   [0032] For some data sources 105A-N, the listener 120 periodically queries the data source 105A-N for the raw data stream containing the log data. For example, data source 105N may include an account of a third party service such Salesforce.com.RTM., DropBox.RTM., Box.RTM., and so on. In such an instance, listener 120 uses provided account credentials to log into an account of a customer and query the third party service for log data. [0035] In some instances, enterprises may be configured to collect log data for third party systems such as SIEMs. In such an embodiment, the enterprises may additionally send the log data to listener 120. Alternatively, or additionally, listener 120 may receive the log data directly from the SIEMs. Such log data may be received before and/or after the SIEMs operate on the log data. )
Regarding claim 8, the combination of Tidwell, Johnson and Ank teach The method recited in claim 1, wherein transmitting the transformed one or more records comprises writing the transformed one or more records to a database (Tidwell [0055] Once a data source object 235 and raw data stream record 245 have been generated for a particular data source 105A-N, new data in the raw data stream 210 from that data source 235 is written to the raw data stream record 245 associated with the data source object 235. To write a raw data stream 210 to the data lake 130, data stream writer 225 may issue a write command including at least one of an appropriate data source ID or raw data stream record ID to the data store interface 135. The data store interface 135 may then write the raw data stream 210 to the raw data stream record 245 having the raw data stream record ID that matches the received raw data stream record ID or that partially matches the received data source record ID. The data lake 130 may have many raw data stream records 245, where each raw data stream record 245 includes log data from a single data source 105A-N. [0057] Listener 120 may send a notice to the indexer to wake the indexer and cause the indexer to begin processing log data in the raw data stream record 245 once that log data is written in the data lake 130. When data stream writer 225 writes data in a raw data stream 210 to the raw data stream record 245, data stream writer 245 may determine an amount of time that has passed since log data was previously written to the raw data stream record 245. If more than a threshold amount of time has passed (e.g., 10 minutes, 4 hours, 1 day, etc.), then listener 120 may send the notice to the indexer. In one embodiment, the data lake 130 includes a notice data stream record, and the notice is sent to the indexer by writing the notice to the notice data stream record. The notice may indicate the raw data stream record 245 that contains data to be processed. The indexer may periodically or continuously check the notice data stream record.)
Regarding claim 9, the combination of Tidwell, Johnson and Ank teach The method recited in claim 1, the method comprising: updating the data lake service to identify a time checkpoint associated with the transformed one or more records (Ank [0085] Metadata can include, without limitation, one or more of the following: the data owner (e.g., the client or user that generates the data), the last modified time (e.g., the time of the most recent modification of the data object), a data object name (e.g., a file name), a data object size (e.g., a number of bytes of data), information about the content (e.g., an indication as to the existence of a particular search term), user-supplied tags, to/from information for email (e.g., an email sender, recipient, etc.), creation date, file type (e.g., format or application type), last accessed time, application type (e.g., type of application that generated the data object), location/network (e.g., a current, past or future location of the data object and network pathways to/from the data object), geographic location (e.g., GPS coordinates), frequency of change (e.g., a period in which the data object is modified), business unit (e.g., a group or department that generates, manages or is otherwise associated with the data object), aging information (e.g., a schedule, such as a time period, in which the data object is migrated to secondary or long term storage), boot sectors, partition layouts, file location within a file folder directory structure, user permissions, owners, groups, access control lists (ACLs), system metadata (e.g., registry information), combinations of the same or other similar information related to the data object. 
In addition to metadata generated by or related to file systems and operating systems, some applications 110 and/or other components of system 100 maintain indices of metadata for data objects [0229] frequency with which primary data 112 or a secondary copy 116 of a data object or metadata has been or is predicted to be used, accessed, or modified; [0230] time-related factors (e.g., aging information such as time since the creation or modification of a data object); [0231] deduplication information (e.g., hashes, data blocks, deduplication block size, deduplication efficiency or other metrics); [0232] an estimated or historic usage or cost associated with different components (e.g., with secondary storage devices 108); [0233] the identity of users, applications 110, client computing devices 102 and/or other computing devices that created, accessed, modified, or otherwise utilized primary data 112 or secondary copies 116; [0234] a relative sensitivity (e.g., confidentiality, importance) of a data object, e.g., as determined by its content and/or metadata;)
Regarding claim 10, the combination of Tidwell, Johnson and Ank teach The method recited in claim 1, wherein the data lake is accessible via an on-demand computing services environment providing computing services to a plurality of organizations via the internet (Tidwell [0029] The various computing devices 115, 125, 145, 155, 170 may be connected via one or more networks, which may include a local area network (LAN), a wide area network (WAN) such as the Internet, and or a combination thereof. Additionally, computing devices 115 may be connected to one or more data sources 105A, 105B through 105N via one or more networks. Client computing devices 180 and/or third party computing devices 182 executing third party services 185 may be connected to computing devices 170 via one or more networks. [0030] Data sources 105A-N are providers of raw data streams of log data. Data sources 105A-N may be devices in an enterprise environment (e.g., on a network of an enterprise) that produce log data. Examples of such devices include computing devices (e.g., server computing devices) that generate system logs, firewalls, routers, identity management systems, switches, and so on. Data sources 105A-N may also include applications, services, modules, etc. that generate log data. The log data in the raw data streams may differ between data sources 105A-N. Examples of log data formats include Syslog messages, simple network management protocol (SNMP) logs, reports from devices and/or applications running on devices, application programming interface (API) call records, information exchange protocols, remote authentication dial-in user service (RADIUS) logs, lightweight directory access protocol (LDAP) logs, security assertion markup language (SAML) messages [FIG. 1 & 3] show a visual of the system)
Regarding claim 11, the combination of Tidwell, Johnson and Ank teach The method recited in claim 10, wherein the computing services environment includes a multitenant database that stores information associated with the plurality of organizations (Tidwell [0029] The various computing devices 115, 125, 145, 155, 170 may be connected via one or more networks, which may include a local area network (LAN), a wide area network (WAN) such as the Internet, and or a combination thereof. Additionally, computing devices 115 may be connected to one or more data sources 105A, 105B through 105N via one or more networks. Client computing devices 180 and/or third party computing devices 182 executing third party services 185 may be connected to computing devices 170 via one or more networks. [0030] Data sources 105A-N are providers of raw data streams of log data. Data sources 105A-N may be devices in an enterprise environment (e.g., on a network of an enterprise) that produce log data. Examples of such devices include computing devices (e.g., server computing devices) that generate system logs, firewalls, routers, identity management systems, switches, and so on. Data sources 105A-N may also include applications, services, modules, etc. that generate log data. The log data in the raw data streams may differ between data sources 105A-N. Examples of log data formats include Syslog messages, simple network management protocol (SNMP) logs, reports from devices and/or applications running on devices, application programming interface (API) call records, information exchange protocols, remote authentication dial-in user service (RADIUS) logs, lightweight directory access protocol (LDAP) logs, security assertion markup language (SAML) messages [FIG. 1 & 3] show a visual of the system)
Claims 5 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200050594 A1; Tidwell; Kenny et al. (hereinafter Tidwell) in view of Johnson; Theodore et al.; US 20180139118 A1 (hereinafter Johnson), US 20210037112 A1; ANKIREDDYPALLE; Ramachandra Reddy et al. (hereinafter Ank), and Armbrust et al., Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores, PVLDB, 13(12): 3411-3424, 2020. DOI: https://doi.org/10.14778/3415478.3415560 (hereinafter Armbrust).
Regarding claim 5, the combination of Tidwell and Ank teach The method recited in claim 4, wherein the pointer to the file is a partition key in (Ank [0173] Some types of snapshots do not actually create another physical copy of all the data as it existed at the particular point in time, but may simply create pointers that map files and directories to specific memory locations (e.g., to specific disk blocks) where the data resides as it existed at the particular point in time. For example, a snapshot copy may include a set of pointers derived from the file system or from an application. In some other cases, the snapshot may be created at the block-level, such that creation of the snapshot occurs without awareness of the file system. Each pointer points to a respective stored data block, so that collectively, the set of pointers reflect the storage location and state of the data object (e.g., file(s) or volume(s) or data set(s)) at the point in time when the snapshot copy was created. [0174] An initial snapshot may use only a small amount of disk space needed to record a mapping or other data structure representing or otherwise tracking the blocks that correspond to the current state of the file system. Additional disk space is usually required only when files and directories change later on. Furthermore, when files change, typically only the pointers which map to blocks are copied, not the blocks themselves. For example for "copy-on-write" snapshots, when a block changes in primary storage, the block is copied to secondary storage or cached in primary storage before the block is overwritten in primary storage, and the pointer to that block is changed to reflect the new location of that block. The snapshot mapping of file system data may also be updated to reflect the changed block(s) at that particular point in time.)								
The combination lacks an explicit teaching of a Delta Lake change log table.
However, Armbrust helps teach a Delta Lake change log table (Armbrust [AB.] In this paper, we present Delta Lake, an open source ACID table storage layer over cloud object stores initially developed at Databricks. Delta Lake uses a transaction log that is compacted into Apache Parquet format to provide ACID properties, time travel, and significantly faster metadata operations for large tabular datasets (e.g., the ability to quickly search billions of table partitions for those relevant to a query). It also leverages this design to provide high-level features such as automatic data layout optimization, upserts, caching, and audit logs. Delta Lake tables can be accessed from Apache Spark, Hive, Presto, Redshift and other systems [Page 3412, col. 1] elaborates on the integration of Delta Lake [Page 3414, col. 2] shows the use of Delta Lake in tables/transaction logs)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all prior methods and make the addition of Armbrust in order to help create a more efficient system (Armbrust [AB.] In this paper, we present Delta Lake, an open source ACID table storage layer over cloud object stores initially developed at Databricks. Delta Lake uses a transaction log that is compacted into Apache Parquet format to provide ACID properties, time travel, and significantly faster metadata operations for large tabular datasets (e.g., the ability to quickly search billions of table partitions for those relevant to a query). It also leverages this design to provide high-level features such as automatic data layout optimization, upserts, caching, and audit logs. Delta Lake tables can be accessed from Apache Spark, Hive, Presto, Redshift and other systems [Page 3414, col. 1] efficient metadata storage [Page 3420, col. 1-2] further elaborate on the efficiency)
Corresponding system claim 16 is rejected similarly to claim 5 above.
Claims 6 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200050594 A1; Tidwell; Kenny et al. (hereinafter Tidwell) in view of Johnson; Theodore et al.; US 20180139118 A1 (hereinafter Johnson), US 20210037112 A1; ANKIREDDYPALLE; Ramachandra Reddy et al. (hereinafter Ank), Armbrust et al., Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores, PVLDB, 13(12): 3411-3424, 2020, DOI: https://doi.org/10.14778/3415478.3415560 (hereinafter Armbrust), and US 20190132280 A1; Meuninck; Troy et al. (hereinafter Troy).
Regarding claim 6, the combination of Tidwell, Armbrust, Johnson and Ank teaches The method recited in claim 4, wherein the pointer to the file is
The combination lacks an explicit teaching of a URI independent of a file system underlying the data lake.
However, Troy helps teach a URI independent of a file system underlying the data lake (Troy [0031] In some embodiments, the internet protocol (IP) address of the resource(s) within a VPC (e.g., the data lake 122, the data lake 132, the collection of cloud applications 126A-N, the collection of cloud processors 136A-N, and/or the SS 140) may change (a)periodically. Thus, a VPC can include a VPC DNS recursor, such as the VPC DNS recursor 124 for VPCDP1 120 and the VPC DNS recursor 134 for the VPCDP2 130, that can receive and query for DNS zone changes within the VPC, such as by determining an IP address for a unique private resource uniform resource identifier (URI) that is associated with access to one or more of the resources within and/or accessible via the VPC, such as the VPCDP1 120. In some instances, a VPC DNS recursor can provide the unique private resource URI to the PE of the data resource community 110 (e.g., the proxy application 214 of the PE A 210A). Because the IP address associated with the unique private resource URI may change a VPC DNS recursor, such as the VPC DNS recursor 124, may not release or broadcast the IP address associated with the unique private resource URI for the particular resource of the data resource community 110 to data partner enterprise networks (e.g., DPEN1 202A and DPEN2 202B) in order to maintain a federated security policy. 
Instead, the provider edge (e.g., PE A 210A) of the data resource community 110 can advertise or otherwise provide a BGP update message informing the data partner enterprise networks (via the private network 102 connected to their respective provider edge) of application connectivity information associated with the available digital resource of the data resource community 110.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the prior methods by adding the teachings of Troy in order to improve the overall functionality of the system by allowing the use of URIs in a system that utilizes data lakes (Troy [0031] In some embodiments, the internet protocol (IP) address of the resource(s) within a VPC (e.g., the data lake 122, the data lake 132, the collection of cloud applications 126A-N, the collection of cloud processors 136A-N, and/or the SS 140) may change (a)periodically. Thus, a VPC can include a VPC DNS recursor, such as the VPC DNS recursor 124 for VPCDP1 120 and the VPC DNS recursor 134 for the VPCDP2 130, that can receive and query for DNS zone changes within the VPC, such as by determining an IP address for a unique private resource uniform resource identifier (URI) that is associated with access to one or more of the resources within and/or accessible via the VPC, such as the VPCDP1 120. In some instances, a VPC DNS recursor can provide the unique private resource URI to the PE of the data resource community 110 (e.g., the proxy application 214 of the PE A 210A). Because the IP address associated with the unique private resource URI may change a VPC DNS recursor, such as the VPC DNS recursor 124, may not release or broadcast the IP address associated with the unique private resource URI for the particular resource of the data resource community 110 to data partner enterprise networks (e.g., DPEN1 202A and DPEN2 202B) in order to maintain a federated security policy. 
Instead, the provider edge (e.g., PE A 210A) of the data resource community 110 can advertise or otherwise provide a BGP update message informing the data partner enterprise networks (via the private network 102 connected to their respective provider edge) of application connectivity information associated with the available digital resource of the data resource community 110.)
Corresponding system claim 17 is rejected similarly to claim 6 above.
Response to Arguments
Applicant's arguments filed 5/19/2022 have been fully considered.
35 USC § 103: 
Regarding Applicant’s argument (pages 6-8), Examiner’s response: It is important to note that this rejection is one of obviousness and not one of anticipation; hence, elements from one reference can be combined with the foundation provided by a separate reference. Here, Tidwell establishes the use of data lakes, and Ank is used in addition to Tidwell to help show metadata describing data/files, which also includes timestamps/times of creation/updating. The examiner does not interpret the current scope of the claim in the way the applicant argues it should be viewed; the examiner believes the applicant is assuming limitations from, and placing too much weight on, the instant application’s specification. The examiner believes these limitations assumed from the specification are not clear and must be brought into the claim’s limitations for the claim to gain the scope the applicant wishes it to have. For example, the term "data lake" is read broadly as some sort of database that can handle different formats of data. The examiner believes the current art can be overcome via more details on the data lakes and more details on how the steps that describe these partitioning/timestamp tracking methods are tailored towards data lakes.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARYAN D TOUGHIRY whose telephone number is (571)272-5212. The examiner can normally be reached Monday - Friday, 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner can be reached on (571) 270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ARYAN D TOUGHIRY/Examiner, Art Unit 2165
/William B Partridge/Primary Examiner, Art Unit 2183