Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3, 5, 7, 10-13, 15, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200174966 A1 Szczepanik; Grzegorz P. et al. (hereinafter SZ) in view of US 20050086270 A1 Shimizu, Tomoyuki et al. (hereinafter Shimizu) and US 20160070739 A1; Gukal; Sreenivas et al. (hereinafter Gukal).
Regarding claim 1, SZ teaches A computer-implemented method comprising: receiving a request to transform records in a data lake that match one or more query criteria; retrieving from the data lake a plurality of data lake records that match the one or more query criteria, (SZ [0026] The data lake system 101 may query the knowledge base 110 for records describing previously created and/or existing data lakes. The data lake system 101 may apply one or more analytics tools, cognitive learning techniques or algorithms, such as machine learning and/or data clustering, to ascertain which historical data lakes have implemented a data lake having the closest correlation with the files being received or stored by the data lake being created or registered and the optimal database engine 119 for reading, writing and updating the files. [0092] Upon creating a new operational database 123 that is being applied to a data lake, the database engine 119 and database repository 120, may be created and ...)
the plurality of data lake records including a first data lake record and a second data lake record, the first data lake record being associated with a designated data lake record identifier and a first timestamp identifying when the first data lake record was created, the second data lake record being associated with the designated data lake record identifier and a second timestamp identifying when the second data lake record was created; (SZ [0082] Each file type and/or any associated metadata being analyzed may have a distinct, recognizable pattern of attributes within the metadata that may be helpful for categorizing the type of data stored by each file without having to process the entire file. For example, while analyzing the metadata of the files, the analysis module 112 can identify timestamp information associated with the file, image resolution, color depth, date of creation, geolocation, or other distinct information. Information such as this type of metadata may help categorize the file being analyzed by the analysis module 112 as an image or video file. [0104] In step 505 of algorithm 500, the analysis module 112 of the analytics engine 109 may analyze and parse through the metadata of each incoming file (whether embedded within the file itself or associated with the file as a separate metadata file). Examples of metadata may include descriptions or attributes about the incoming file (file type, author, date created, length, resolution, file size, etc.) metatags or keywords identifying themes of the file or words that may be referenced by the document repeatedly throughout, timestamps, file structures and other evidence that may help identify a type of file or category the file may be classified as, without having to fully process or extract the data from the file. The analysis module 112 may annotate or tag files with keywords or descriptors which may be used by the machine learning module 114 to categorize the file data or file type. [19,23,86-87] further elaborate on the use of metadata with the files) SZ teaches a plurality of files/records with a plurality of different and same timestamps/metadata.
SZ lacks explicitly teaching based on a comparison of the first timestamp to the second timestamp, generating a transformed data lake record based on the first data lake record; and transmitting the transformed data lake record to a downstream data service.
However, Shimizu helps teach these limitations (Shimizu [0051] As shown in FIG. 1, the update notifying apparatus 1 is comprised of a notification target data storage section 101 that acquires a target data file the update of which is to be notified and past data files (time-series data files) thereof to manage and store these files, and recognizes a modification in the notification target data file, a time-series data identifying section 102 that manages information necessary for extracting a proper one of the past data files stored in the notification target data storage section 101, a detecting section 103 that compares two data files and determines whether or not a modification of the notification target data file satisfies a particular criterion (update criterion), an updated content extracting section 104 that extracts a content of the modification which is determined to satisfy the criterion, i.e. detected by the detecting section 103, a notification content storage section 105 that stores updated contents extracted by the updated content extracting section 104, in a form compiled as information for notification, and a notifying section 106 that notifies the notification content stored in the notification content storage section 105 to an apparatus user to whom the update notifying apparatus 1 sends the update notification content. [0052] The notification target data storage section 101 may manage acquisition of the notification target data file and the past data files of the ...)
Gukal helps teach wherein the first and second timestamps are different (Gukal [0035] “…The data structure that defines a batch of the partitioned log records, can include a batched log record identifier (e.g., batchedLogRecordlD) that uniquely identifies the batch relative to the other batches…a list of timestamps of the log records within the batch.” The unique identifier means they are different timestamps.)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all prior methods and make the addition of Gukal in order to help batch data together to create a more compressed and efficient system while maintaining an organized data structure with a plurality of timestamps (Gukal [0035] More generally, the log record batch processor 102 partitions (block 216) the log records into batches, where each of the batches is defined by a data structure that includes the template identifier and the attribute identifier for each of the log records within the batch. The data structure that defines a batch of the partitioned log records, can include a batched log record identifier (e.g., batchedLogRecordlD) that uniquely identifies the batch relative to the other batches, the ...)
Corresponding system claim 13 is rejected similarly as claim 1 above. Additional Limitations: Device with processor(s) and memory (SZ [FIG.6] Device with processor(s) and memory)
Corresponding product claim 18 is rejected similarly as claim 1 above. Additional Limitations: computer readable medium capable of reading and executing instructions 
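By way of illustration only, the steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application, SZ, Shimizu, or Gukal. The names `LakeRecord`, `transform_matching`, `matches`, and `send_downstream` are hypothetical, and selecting the most recently created record for each identifier is an assumption: the claim language only requires that the transformed record be generated based on a comparison of the two timestamps.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class LakeRecord:
    """Hypothetical stand-in for a data lake record."""
    record_id: str        # designated data lake record identifier
    created_at: datetime  # timestamp identifying when the record was created
    payload: dict


def transform_matching(
    records: Iterable[LakeRecord],
    matches: Callable[[LakeRecord], bool],
    send_downstream: Callable[[dict], None],
) -> List[dict]:
    """Retrieve matching records, compare timestamps per identifier,
    generate transformed records, and transmit them downstream."""
    latest: Dict[str, LakeRecord] = {}
    for rec in records:
        if not matches(rec):  # query criteria
            continue
        current = latest.get(rec.record_id)
        # Comparison of timestamps for records sharing one identifier.
        # Assumption: the later-created record wins.
        if current is None or rec.created_at > current.created_at:
            latest[rec.record_id] = rec

    transformed = []
    for rec in latest.values():
        out = {
            "id": rec.record_id,
            "created_at": rec.created_at.isoformat(),
            "data": dict(rec.payload),
        }
        send_downstream(out)  # transmit to a downstream data service
        transformed.append(out)
    return transformed
```

In this sketch, two records that share an identifier but carry different creation timestamps yield a single transformed record, which is the behavior the timestamp comparison is meant to illustrate.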
Regarding claim 3, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 1, wherein the designated data lake record identifier is associated with a third data lake record corresponding with an update request. (SZ [0027] a data lake system 101 may be managing a plurality of different types of file types and data categories. Under such circumstances, more than one operational database 123 may be needed to manage the different file types and data categories and thus the data lake system 101 may consult the knowledge base 110 to identify an operational database 123 to manage each file type or category of data being stored based on one or more of the historical data lakes that have historically managed the same types of files and data. [0082] Each file type and/or any associated metadata being analyzed may have a distinct, recognizable pattern of attributes within the metadata that may be helpful for categorizing the type of data stored by each file without having to process the entire file. For example, while analyzing the metadata of the files, the analysis module 112 can identify timestamp information associated with the file, image resolution, color depth, date of creation, geolocation, or other distinct information. Information such as this type of metadata may help categorize the file being analyzed by the analysis module 112 as an image or video file. [0087] identify common attributes of metadata between each of the files streamed to or stored by the raw data storage 117. Examples of unsupervised machine learning may include self-organizing maps, nearest-neighbor mapping, k-means clustering, and singular value decomposition. [0092] Upon creating a new operational database 123 that ...)
Corresponding system claim 15 is rejected similarly as claim 3 above
Corresponding product claim 20 is rejected similarly as claim 3 above
Regarding claim 5, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 1, wherein the designated data lake record identifier is associated with a third request to delete the first data lake record, and wherein the third data lake record is associated with a third timestamp (SZ [0090] A database engine 119 may refer to an underlying software component or module that an operational database 123 may use to create, read, update and delete data from database records, which may be stored (for example, as tables) within the database repository 120. Embodiments of the database engine 119 may process the raw data stored as files in the raw data storage 117 upon request for further processing by a user, administrator and/or data scientist operating the data lake system 101. The database engine 119 may extract one or more attributes about the file from ...) SZ teaches a plurality of files/records with a plurality of different and same timestamps/metadata.
Regarding claim 7, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 1, wherein a designated one of the data lake partition identifiers is associated with a pointer to a file in the data lake (SZ [0023] Embodiments of data lake systems 101 described herein may generate a file list describing each file being streamed to the data lake or stored by the data lake. One or more tools may be used to inspect and analyze the stream of incoming files and/or analyze each of the files currently stored by the data lake. In particular, the tools may analyze the files for metadata or separate metadata files which may be associated with the files being analyzed. The term "metadata" may refer to data that describes the file's data being streamed or stored by the data lake. The metadata may be embedded within the files being analyzed in some embodiments, or in other embodiments, the metadata may be ingested into the data lake as a separate metadata file which may ...)
Regarding claim 10, the combination of SZ, Gukal and Shimizu teach the computer-implemented method recited in claim 1, wherein the data lake records are stored in one or more third-party cloud computing storage system (SZ [0002] A data lake is a data-centered architecture featuring a repository capable of storing vast quantities of data in various formats. Data from webserver logs, databases, social media, and third-party data is ingested into the data lake. Curation takes place through capturing metadata, making the data available in a data catalog. A data lake can hold data in an unstructured manner. There may not be a hierarchy or organization to the storage of individual pieces of data ingested into the data lake. The data held by the data lake is not processed or analyzed upon ingestion into the data lake. Instead, a data lake accepts and retains data from a plurality of data sources, supports a broad array of data types and applies a schema to the data once the data is ready to be used. [0041] Resource pooling: the provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to demand. There is a sense of location independence in that the consumer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter).)
Regarding claim 11, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 1, wherein the data lake is accessible via an on-demand computing services environment providing computing services to a plurality of organizations via the internet (SZ  [0033] Referring to the drawings, FIGS. 1a-4 depict diagrams of a computing environment 100, 180, 190, 200, 280, 350, capable of recommending and/or provisioning an operational database 123 to a data lake, and/or sorting incoming files to a data lake having an operational database 123 best suited for managing the files, in accordance with the embodiments of the present disclosure. Embodiments of computing environment 100, 180, 190, 200, 280, 350 may include a plurality of computer systems and devices interconnected via a computer network 150, such as a data lake system 101a, 101b . . . 101n (referred herein referred to individually or plurality as "data lake system 101") , analytics system 130, a plurality of client devices 153a . . . 153n (hereinafter referenced collectively as "client device 153"), network accessible hardware such as a network accessible repository ... [36-40,67] further elaborate on the internet and computing services [FIG.3] shows a visual of the system connected to the internet, plurality of organizations/entities, and an on-demand environment)
Regarding claim 12, the combination of SZ, Gukal and Shimizu teach the computer-implemented method recited in claim 11, wherein the computing services environment includes a multitenant database that stores information associated with the plurality of organizations. (SZ [0033] Referring to the drawings, FIGS. 1a-4 depict diagrams of a computing environment 100, 180, 190, 200, 280, 350, capable of recommending and/or provisioning an operational database 123 to a data lake, and/or sorting incoming files to a data lake having an operational database 123 best suited for managing the files, in accordance with the embodiments of the present disclosure. Embodiments of computing environment 100, 180, 190, 200, 280, 350 may ...)
Claims 2, 4, 14, 16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200174966 A1 Szczepanik; Grzegorz P. et al. (hereinafter SZ) in view of US 20050086270 A1 Shimizu, Tomoyuki et al. (hereinafter Shimizu), US 20160070739 A1; Gukal; Sreenivas et al. (hereinafter Gukal) and US 20060155945 A1; McGarvey; John Ryan (hereinafter McGarvey).
Regarding claim 2, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 1, wherein the comparison comprises
The combination lacks explicitly teaching determining that the first timestamp precedes the second timestamp.
However, McGarvey helps teach determining that the first timestamp precedes the second timestamp (McGarvey [0046] Returning again to step 506, if the original source timestamp does not equal the original target timestamp thus indicating that a replication conflict has occurred, a comparison of the source original timestamp, the target original timestamp and the source updated timestamp is made (step 510). Particularly, a comparison of the source original, source updated, and target original timestamps are made to determine if the target original timestamp is both greater than the source original timestamp and less than the source updated timestamp. If the target original timestamp is both greater than the source original timestamp and less than the source updated timestamp thus indicating that the source has the most recent version of the entry, the target returns a request for a "refresh" to be performed on the target by the source (step 512). On receipt of the refresh request, the source sends an Add command for the entire modified entry to the target (step 514), and generates a log record of the conflict between the source and target as well as a record of the refresh operation performed (step 516). On receipt of the add command, the target once again compares timestamps to determine if a change has occurred to the entry with a timestamp later than the refreshed record (step 517). Particularly, the target original timestamp is compared with the source updated timestamp. If the target original timestamp is less than the source updated timestamp, the add of the refreshed entry is rejected, and the replication routine exits according to step 530. If the target original timestamp is determined to be greater than the source updated timestamp at step 517, the target removes the corresponding entry from the data store at the target, adds the ...)
Corresponding system claim 14 is rejected similarly as claim 2 above
Corresponding product claim 19 is rejected similarly as claim 2 above
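For illustration only, the first timestamp comparisons described in the quoted portion of McGarvey [0046] (steps 506 through 512) can be sketched as below. This is a hedged sketch, not McGarvey's implementation: the function and parameter names are hypothetical, and only the two conditions stated in the quoted passage are modeled. The later steps of the routine are omitted.

```python
from datetime import datetime


def replication_conflict(source_original: datetime,
                         target_original: datetime) -> bool:
    """Step 506 as quoted: unequal original timestamps indicate a conflict."""
    return source_original != target_original


def target_should_request_refresh(source_original: datetime,
                                  target_original: datetime,
                                  source_updated: datetime) -> bool:
    """Steps 510-512 as quoted: the target requests a refresh when its
    original timestamp is both greater than the source original timestamp
    and less than the source updated timestamp, which indicates that the
    source holds the most recent version of the entry."""
    return (replication_conflict(source_original, target_original)
            and source_original < target_original < source_updated)
```

The ordering test `source_original < target_original < source_updated` is one instance of determining that one timestamp precedes another.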
Regarding claim 4, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 3, wherein the third data lake record is associated with
The combination lacks explicitly teaching a third timestamp that precedes the second timestamp, and wherein the first timestamp precedes the third timestamp.
However, McGarvey helps teach a third timestamp that precedes the second timestamp, and wherein the first timestamp precedes the third timestamp. (McGarvey [0046] Returning again to step 506, if the original source timestamp does not equal the original target timestamp thus indicating that a replication conflict has occurred, a comparison of the source original timestamp, the target original timestamp and the source updated timestamp is made (step 510). Particularly, a comparison of the source original, source updated, and target original timestamps are made to determine if the target original timestamp is both greater than the source original timestamp and less than the source updated timestamp. If the target original timestamp is both greater than the source original timestamp and less than the source updated timestamp thus indicating that the source has the most recent version of the entry, the target returns a request for a "refresh" to be performed on the target by the source (step 512). On receipt of the refresh request, the source sends an Add command for the entire modified entry to the target (step 514), and generates a log record of the conflict ...) SZ teaches a plurality of files/records with a plurality of different and same timestamps/metadata.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all prior methods and make the addition of McGarvey in order to further facilitate data access and improve efficiency (McGarvey [0002] The present invention relates generally to an improved data processing system and in particular to a method and apparatus for resolving a replication conflict in a multi-mastered data processing system. [0004] In many data processing system environments, client applications must have uninterrupted read and write access to a directory data service. It such environments, it is advantageous if no single point of failure or network link outage may cause a loss of data access. To facilitate such data access, databases and other data stores are often replicated such that multiple data server replicas are accessible by clients. Replicas may be read-only ...)
Corresponding system claim 16 is rejected similarly as claim 4 above
Claims 6 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200174966 A1 Szczepanik; Grzegorz P. et al. (hereinafter SZ) in view of US 20050086270 A1 Shimizu, Tomoyuki et al. (hereinafter Shimizu), US 20160070739 A1; Gukal; Sreenivas et al. (hereinafter Gukal) and US 20150120656 A1; Ramnarayanan; Jagannathan et al. (hereinafter Ramn).
Regarding claim 6, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 5, wherein the first data lake record
The combination lacks explicitly teaching is flagged for deletion after a designated period of time has elapsed after the third timestamp.
However, Ramn helps teach record is flagged for deletion after a designated period of time has elapsed after the third timestamp (Ramn [0043] As described, metadata files that are created as a result of queue flushes, e.g., temporary metadata files 208(a), 208(b), and/or metadata files that are created as a result of compaction, e.g., 208(c), can be marked for deletion. For instance, temporary metadata files 208(a), 208(b) may be marked for deletion after minor compaction is performed, and the intermediate metadata files 208(c) or snapshot metadata files 208(d) can be marked for deletion after major compaction, after a threshold period of time has passed, or after all of the requested events identified in the metadata files have been performed by the system 100. Thus, a large number of expired metadata files, e.g., those metadata files that have been marked for deleted, may exist, such that deletion of the expired metadata files may be required or desirable. In some implementations, an expired metadata file can be deleted after the passing of a threshold period of time, e.g., 12 hours after it is marked as expired, or metadata files can be deleted periodically regardless of how long a particular metadata file has been expired, e.g., every 12 hours. In some implementations, the period of time with which expired metadata files can be removed from the distributed file system 106 can be configured by a user of the system 100, by a moderator of the system 100, or based on other criteria, e.g., based on the system 100 determining that it should remove one or more expired metadata files. In some implementations, expired metadata files can be deleted based on other conditions or operations, e.g., can be manually deleted by users of the system, can be deleted in response to determining a total size of expired metadata files stored at the distributed file system 106, or can be deleted based on detecting other conditions. [0085] if the minimum compaction setting specifies a total file size, existing operation log files may ...)
Corresponding system claim 17 is rejected similarly as claim 6 above
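For illustration only, the limitation of claim 6 (flagging a record for deletion once a designated period has elapsed after the third timestamp) can be sketched as follows. This is a minimal sketch under stated assumptions, not code from the application or from Ramn. The function and parameter names are hypothetical, and the 12-hour period is borrowed from the example value in the quoted Ramn [0043] passage purely as a sample input.

```python
from datetime import datetime, timedelta


def should_flag_for_deletion(delete_requested_at: datetime,
                             now: datetime,
                             designated_period: timedelta) -> bool:
    """Return True once the designated period has elapsed since the
    timestamp of the delete request (the "third timestamp")."""
    return now - delete_requested_at >= designated_period


# Sample usage with an assumed 12-hour designated period.
requested = datetime(2021, 12, 20, 8, 0)
period = timedelta(hours=12)
flag_early = should_flag_for_deletion(requested, datetime(2021, 12, 20, 12, 0), period)
flag_late = should_flag_for_deletion(requested, datetime(2021, 12, 20, 20, 0), period)
```

Here `flag_early` is False because only four hours have passed, and `flag_late` is True because exactly twelve hours have passed.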
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over US 20200174966 A1 Szczepanik; Grzegorz P. et al. (hereinafter SZ) in view of US 20050086270 A1 Shimizu, Tomoyuki et al. (hereinafter Shimizu), US 20160070739 A1; Gukal; Sreenivas et al. (hereinafter Gukal) and Armbrust et al. Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores. PVLDB, 13(12): 3411-3424, 2020. DOI: https://doi.org/10.14778/3415478.3415560 (hereinafter Armbrust)
Regarding claim 8, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 7, wherein the pointer to the file is a partition key
The combination lacks explicitly teaching a partition key in a Delta Lake change log table.
However, Armbrust helps teach a Delta Lake change log table (Armbrust [AB.] In this paper, we present Delta Lake, an open source ACID table storage layer over cloud object stores initially developed at Databricks. Delta Lake uses a transaction log ...)
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over US 20200174966 A1 Szczepanik; Grzegorz P. et al. (hereinafter SZ) in view of US 20050086270 A1 Shimizu, Tomoyuki et al. (hereinafter Shimizu), US 20160070739 A1; Gukal; Sreenivas et al. (hereinafter Gukal) and ... (hereinafter Troy)
Regarding claim 9, the combination of SZ, Gukal and Shimizu teach The computer-implemented method recited in claim 1, wherein the pointer to the file
The combination lacks explicitly teaching a URI independent of a file system underlying the data lake.
However, Troy helps teach a URI independent of a file system underlying the data lake (Troy [0031] In some embodiments, the internet protocol (IP) address of the resource(s) within a VPC (e.g., the data lake 122, the data lake 132, the collection of cloud applications 126A-N, the collection of cloud processors 136A-N, and/or the SS 140) may change (a)periodically. Thus, a VPC can include a VPC DNS recursor, such as the VPC DNS recursor 124 for VPCDP1 120 and the VPC DNS recursor 134 for the VPCDP2 130, that can receive and query for DNS zone changes within the VPC, such as by determining an IP address for a unique private resource uniform resource identifier (URI) that is associated with access to one or more of the resources within and/or accessible via the VPC, such as the VPCDP1 120. In some instances, a VPC DNS recursor can provide the unique private resource URI to the PE of the data resource community 110 (e.g., the proxy application 214 of the PE A 210A). Because the IP address associated with the unique private resource URI may change a VPC DNS recursor, such as the VPC DNS recursor 124, may not release or broadcast the IP address associated with the unique private resource URI for the particular resource of the data resource community 110 to data partner enterprise networks (e.g., DPEN1 202A and DPEN2 202B) in order to maintain a federated security policy. Instead, the ...)
Response to Arguments
Applicant's arguments filed 12/20/2021 have been fully considered.
35 USC § 103: 
Regarding Applicant’s Argument (page(s): 6-9): Examiner’s response: Applicant’s arguments, filed 12/20/2021, with respect to the rejection(s) under 35 USC § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of US 20160070739 A1; Gukal; Sreenivas et al. The examiner recommends further elaborating on the “designated data lake record identifiers” and encourages any language that helps differentiate “designated data lake record identifiers” from the broad interpretation of some sort of metadata identifying a section/partition of data.

Conclusion
Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARYAN D TOUGHIRY whose telephone number is (571)272-5212. The examiner can normally be reached Monday - Friday, 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ARYAN D TOUGHIRY/Examiner, Art Unit 2165
/William B Partridge/Primary Examiner, Art Unit 2183