DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
This action is in response to remarks and amendments submitted on 10/22/2020, in which claims 1-20 were presented for further examination. The applicant’s remarks and amendments to the claims were considered with the following results:
In response to the last Office Action: 
Claims 1, 5-9, and 16-18 are currently amended. 
No claims are currently cancelled.
Claims 1-20 are pending.
The previous 35 USC § 101 rejection, citing signal per se, has been withdrawn, as necessitated by applicant’s amendments to the impacted claims.
The previous 35 USC § 112(b) rejection has been withdrawn, as necessitated by applicant’s amendment to the impacted claims. However, a new 35 USC § 112(b) rejection is issued below.

Response to Arguments
The applicant’s remarks and/or arguments, filed 10/22/2020 with respect to claim(s) 1-20, have been fully considered. 



Applicant asserts that Etgen (US PGPub 20020184570) and De Schacht (US PGPub 20130173691), considered alone or in combination, do not describe all the features of amended claim 1 (claims 5 and 9 recite similar limitations). Amended claim 1 reads as follows:
“... comparing, at the database system, record counts of the first set of log files with record counts of the second set of log files; 
ranking, in response to the comparing, the second set of log files based on a completeness impact determined for each of the second set of log files, wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and a corresponding one of the first set of log files; 
determining a set of replay candidate log files based on the ranking of the second set of log files and their respective completeness impacts; and 
requesting, by the database system, a replaying of a subset of the replay candidate log files that achieves a target completeness value.”

The examiner notes the applicant’s amendments and arguments are not persuasive because the combination of the prior art references is reasonable, as well as relevant, for teaching the combination of elements recited in the impacted claims. The examiner reminds applicant to focus on the combined teachings of both prior art references of record.
The examiner notes comparing the log entries in Etgen is analogous to the comparison disclosed in the instant application. The examiner notes Etgen, ¶ [0018], discloses storing data in logs. Etgen, ¶ [0027], discloses the analysis may examine time segments within the log to determine whether time gaps between data are sufficient to indicate that a loss of data has occurred. This analysis may include comparing the log to prior logs to determine whether any deviations in the current log are sufficient to warrant an indication that a data loss has occurred. Etgen, ¶ [0040], discloses the process begins with a specification of time segments and time gap tolerance. The specification of time gap tolerances may be made through an automated process analyzing “clean” example logs. Etgen, ¶ [0053], discloses the log analyzer process may begin to process new server logs as they are provided.
The examiner notes time segments within the log are examined. This indicates each log is further broken down into subsets of log files, based on time segments, and compared to the same in prior log files to determine whether
Further, the examiner notes De Schacht was introduced to disclose the ranking features associated with the claimed invention. De Schacht, ¶ [0018, 0020], discloses choosing the best-candidate file from a plurality of log servers. The best-candidate file is chosen based on not having lost more than x percent of application messages. De Schacht, ¶ [0054, 0055], discloses the best-candidate file having lost application messages and not having lost more than x percent of application messages for the interval is augmented by the lost application messages existing in other files of the set of files, x being predetermined. In one embodiment, x is comprised between fifteen and forty five … upon determining from among a plurality of application data files from each of the plurality of log servers, an application data file as a best-candidate for a given interval, the server forwards the best-candidate file for application processing.
The examiner notes selecting a “best-candidate” is indicative of a ranking scheme when evaluated further based on the percent of messages lost. Further, De Schacht similarly uses time periods or intervals (as in Etgen) to compare the respective log files and, in the end, chooses the best candidate to be sent for further processing.
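
For illustration only, the best-candidate selection attributed to De Schacht above can be read as an ordering of the files for an interval. The following is a minimal sketch under stated assumptions, not De Schacht's implementation: the names `IntervalFile`, `rank_files`, and `best_candidate` are hypothetical, loss is expressed as a percentage, the control-message tie-break follows the cited ¶ [0019], and the default x=30 follows the cited ¶ [0084].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalFile:
    """One application data file for a given interval (hypothetical structure)."""
    server: str
    app_loss_pct: float      # percent of application messages lost in the interval
    control_loss_pct: float  # percent of control messages lost in the interval


def rank_files(files):
    """Order files from most to least complete.

    Lowest application message loss comes first; control message loss
    breaks ties between files with equal application message loss.
    """
    return sorted(files, key=lambda f: (f.app_loss_pct, f.control_loss_pct))


def best_candidate(files, x=30.0):
    """Return (best file, whether it may be augmented from the other files).

    Augmentation is allowed only when the best file has lost some
    application messages but not more than x percent (assumed threshold).
    """
    ranked = rank_files(files)
    if not ranked:
        return None, False
    best = ranked[0]
    can_augment = 0 < best.app_loss_pct <= x
    return best, can_augment
```

Under this reading, choosing the first element of the ordered list is what makes the “best-candidate” selection a ranking of the files by loss.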



It should be noted that any citations to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. See MPEP 2123.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claims 1, 5, and 9 recite “…comparing, at the database system, record counts of the first set of log files with record counts of the second set of log files; … ranking, in response to the comparing, the second set of log files based on a completeness impact determined for each of the second set of log files, wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and a corresponding one of the first set of log files…”. The examiner notes the comparing limitation recites comparing record counts of the first set of log files with record counts of the second set of log files. This limitation appears to compare overall counts, collectively comparing one set to the other. The subsequent ranking limitation then appears to rank individual log files within the second set against respective individual log files within the first set. The ranking limitation, however, recites “each of the second set of log files” and later recites “that particular one of the second set of log files”. It is unclear to the examiner which of the “each” is being referred to by “that particular one”. The examiner suggests applicant review and make changes where necessary. Dependent claims 2-4, 6-8, and 10-20 are also rejected for their dependencies and for failing to cure the deficiencies of the respective independent claims. Appropriate action is required.

Claims 1, 5, and 9 recite “…requesting…a replaying of a subset…”. Further, claims 4, 8, and 12 recite “…communicating a replay request…”. The examiner notes it is unclear whether the request in claims 4, 8, and 12 is the same as, or different from, the request recited in the respective independent claims. Dependent claims 2-4, 6-8, and 10-20


Examiner Remarks
The examiner strongly suggests applicant review the entire claim set to identify additional 112(b) clarity and/or antecedent basis issues. The examiner notes it is applicant's duty to provide claim language referring to claimed elements that are clearly discernible (throughout independent and corresponding dependent claims) and that have proper antecedent basis. Applicant’s cooperation is required.  

Upon further review of the claim language, especially in light of applicant’s arguments on the record, the examiner has determined that the claimed invention is directed to an abstract idea. The limitations of each independent claim recite comparing and ranking data merely to produce a subset of the already existing data. Further, applicant’s arguments confirm there is nothing more to the claimed concept besides ranking and comparing data. As a result, the 35 USC § 101 rejection is issued below.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
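Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.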

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Claims 1-20 are directed to an abstract idea. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Step 1
The method, as claimed in claim 1, is directed to a process. The non-transitory machine-readable storage medium, as claimed in claim 5, is directed to an apparatus. The system, as claimed in claim 9, is directed to a machine.

Step 2A
While the claims fall within a statutory category, under revised Step 2A, Prong 1 of the 2019 PEG, the claimed invention is directed to (e.g. sets forth or describes) an abstract idea associated with comparing and ranking data. Specifically, each claim is directed to the abstract idea of:

Claim 1:
“receiving, at a database system, the first set of log files, each of the first set of log files including a plurality of log file records; storing, at the database system, a second set of log files, each of the second set of log files corresponding to a respective one of the first set of log files; comparing, at the database system, record counts of the first set of log files with record counts of the second set of log files; ranking, in response to the comparing, the second set of log files based on a completeness impact determined for each of the second set of log files, wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and a corresponding one of the first set of log files; determining a set of replay candidate log files based on the ranking of the second set of log files and their respective completeness impacts; and requesting, by the database system, a replaying of a subset of the replay candidate log files that achieves a target completeness value”.
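
For illustration only, the comparing, ranking, determining, and requesting steps recited above can be sketched as follows. This is a minimal sketch under stated assumptions, not applicant's implementation: the names `LogFilePair`, `completeness`, and `select_replay_subset` are hypothetical, and because the claim does not define how the target completeness value is measured, the sketch assumes it is the fraction of first-set records present in the second set, with a replayed file treated as fully restored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogFilePair:
    """A second-set (stored) log file paired with its first-set (source) file."""
    name: str
    source_records: int  # record count of the corresponding first-set log file
    stored_records: int  # record count of the second-set log file

    @property
    def completeness_impact(self) -> int:
        # Difference in record counts between the two corresponding files.
        return max(self.source_records - self.stored_records, 0)


def completeness(pairs, replayed=frozenset()):
    """Fraction of source records present (assumed definition of completeness)."""
    total = sum(p.source_records for p in pairs)
    if total == 0:
        return 1.0
    present = sum(
        p.source_records if p.name in replayed
        else min(p.stored_records, p.source_records)
        for p in pairs
    )
    return present / total


def select_replay_subset(pairs, target):
    """Rank incomplete files by impact, then take files until the target is met."""
    candidates = sorted(
        (p for p in pairs if p.completeness_impact > 0),
        key=lambda p: p.completeness_impact,
        reverse=True,
    )
    chosen = []
    for p in candidates:
        if completeness(pairs, frozenset(chosen)) >= target:
            break
        chosen.append(p.name)
    return chosen
```

With four files of 100 source records each and stored counts of 100, 60, 90, and 50, a target of 0.95 selects only the two files missing 50 and 40 records, since replaying them raises the assumed completeness from 0.75 to 0.975.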

Under revised Step 2A, Prong 1 of the 2019 PEG, it is necessary to evaluate whether the claim recites a judicial exception by referring to subject matter groupings articulated in the 2019 Revised Patent Subject Matter Eligibility Guidance, hereinafter referred to as the “2019 PEG”. When considering the 2019 PEG, the claims recite an abstract idea. For example, representative claim 1 recites an abstract idea associated with comparing and ranking data, irrespective of the type of data (i.e. log files). The concepts recited in representative claim 1 represent an idea 'of itself'. An idea 'of itself' is used to describe an idea standing alone such as a concept, plan, or scheme, as well as a mental process (thinking) that "can be performed in the human mind or by a human using a pen and paper". Mental processes are defined by the 2019 PEG as including
Under revised Step 2A, Prong 2 of the 2019 PEG, if it is determined that the claims recite a judicial exception, it is then necessary to evaluate whether the claims recite additional elements that integrate the judicial exception into a practical application of that exception. In this case, representative claim 1 and the remaining independent claims selectively include additional elements as follows: Claim 1 – database system, Claim 5 – database system, and Claim 9 – database system and processor. Although each claim recites additional elements, the additional elements do not integrate the abstract idea into a practical application because they amount to no more than a general link of the use of the abstract idea to a particular technological environment or field of use. Specifying that the abstract idea associated with comparing and ranking data is performed in the respective environments merely indicates a field of use in which to apply the abstract idea, because this requirement merely limits the claims to the computer field, i.e., to execution on generically claimed components.

Step 2B
Under Step 2B of the 2019 PEG, if it is determined that the claims recite a judicial exception that is not integrated into a practical application of that exception, it is then necessary to evaluate the additional elements individually and in combination to determine whether they provide an inventive concept (i.e., whether the additional elements amount to significantly more than the exception itself). In this case, as noted above, the additional elements recited in independent claims 1, 5, and 9 are recited and 
Even when considered as an ordered combination, the additional elements in any of claims 1, 5, and 9 do not add anything that is not already present when the limitations of each claim are considered individually for the respective claim. When viewed as a whole, claims 1, 5, and 9 simply convey the abstract idea itself facilitated by generically claimed computing components. Therefore, under Step 2B, there are no meaningful limitations in claims 1, 5, and 9 that transform the judicial exception into a patent eligible application such that the claims amount to significantly more than the judicial exception itself.
As such, claims 1, 5, and 9 are ineligible. 

For example, claims 2-4, 6-8, and 10-20 further analyze the log files and communicate the log files to different systems.
In general, claims 2-4, 6-8, and 10-20 provide for further analysis and transmission of particular sets of data, and provide further embellishments of the limitations recited in the respective independent claims. The claims generally describe aspects of ranking and comparing log files, and the additional elements are recited at a high level of generality, such that they do not provide novelty or an improvement to the already general data processing features of the impacted claims.
Thus, dependent claims 2-4, 6-8, and 10-20 are also ineligible. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
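A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.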

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication, US 20020184570, to Michael Peter Etgen, hereinafter “Etgen”, in view of US Patent Application Publication, US 20130173691, to Paul De Schacht et al, hereinafter “De Schacht”.

Regarding claim 1, Etgen teaches a method for improving completeness of a first set of log files (Etgen, ¶ [0001], teaches the present invention provides a method, apparatus, and computer implemented instructions for calculating data integrity metrics for Web server activity log analysis. Etgen, ¶ [0038], teaches this Web activity report also may include an identification as to the confidence or accuracy of the analysis. Further, Etgen, ¶ [0059], teaches the mechanism also may fill-in missing data to increase the accuracy or integrity of the report), comprising: 
receiving, at a database system, the first set of log files, each of the first set of log files including a plurality of log file records (Etgen, ¶ [0027], teaches the analysis is conducted using metrics database. The analysis may examine time segments within the log to determine whether time gaps between data are sufficient to indicate that a loss of data has occurred. Etgen, ¶ [0060], teaches logs may be selected from section for analysis. Logs may be added to section 404 by selecting “Add” button);
storing, at the database system, a second set of log files, each of the second set of log files corresponding to a respective one of the first set of log files (Etgen, ¶ [0018], teaches server may store data in logs, which may reflect accesses and requests by clients. Further, Etgen, ¶ [0020], teaches the analysis of logs generated by a server may be analyzed using a data processing system similar to data processing system. Etgen, ¶ [0040], teaches the specification of time gap tolerances may be made through an automated process analyzing “clean” example logs. As used herein, “clean” logs are logs that are known to contain no time gaps due to data loss. In this example, a user provides 4 example logs that are believed to be free of data loss. The process in step 600 goes through the logs record by record and combines all of the data from matching 30 minute chunks of time); 
comparing, at the database system, record counts of the first set of log files with record counts of the second set of log files (Etgen, ¶ [0026], teaches the data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions. These data integrity metrics may be used by a log analysis process to “fill-in” holes in data if desired. For example, reports showing total hit counts may be generated with “fill-in” data when data integrity problems are identified. Such a report may state that the numbers reflect the likely number of total hits with some measure of data integrity. Further, Etgen, ¶ [0027], teaches the analysis may examine time segments within the log to determine whether time gaps between data are sufficient to indicate that a loss of data has occurred. This analysis may include comparing the log to prior logs to determine whether any deviations in the current log are sufficient to warrant an indication that a data loss has occurred);
Etgen teaches the limitations as identified above. 
Etgen does not explicitly teach:
ranking, in response to the comparing, the second set of log files based on a completeness impact determined for each of the second set of log files, wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and the corresponding one of the first set of log files; 
determining a set of replay candidate log files based on the ranking of the second set of log files and their respective completeness impacts; and 
requesting, by the database system, a replaying of a subset of the replay candidate log files that achieves a target completeness value.  
However, De Schacht teaches:
ranking, in response to the comparing, the second set of log files based on a completeness impact determined for each of the second set of log files (De Schacht, ¶ [0018], teaches the best-candidate file is chosen from a set of application data files for a given interval from the plurality of log servers and that have the same start and stop points. Further, De Schacht, ¶ [0020], teaches the best-candidate file having lost application messages and not having lost more than x percent of application messages for the interval is augmented by the lost application messages existing in other files of the set of files, x being predetermined. In one embodiment, x is comprised between fifteen and forty five), wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and a corresponding one of the first set of log files (De Schacht, ¶ [0019], teaches the best-candidate file is chosen from among the chosen set of files, the file with the lowest application message loss rate. According to an advantageous embodiment, in case some application data files have the same number of application messages, then the best-candidate file is chosen from among the application data files with the lowest application message loss rate, the file with the lowest control message loss rate); 
determining a set of replay candidate log files based on the ranking of the second set of log files and their respective completeness impacts (De Schacht, ¶ [0054, 0055], discloses the best-candidate file having lost application messages and not having lost more than x percent of application messages for the interval is augmented by the lost application messages existing in other files of the set of files, x being predetermined. In one embodiment, x is comprised between fifteen and forty five … upon determining from among a plurality of application data files from each of the plurality of log servers, an application data file as a best-candidate for a given interval, the server forwards the best-candidate file for application processing); and 
requesting, by the database system, a replaying of a subset of the replay candidate log files that achieves a target completeness value (De Schacht, ¶ [0084], teaches the system also improves the quality of the selected best-candidate by retrieving a part of missing messages in other synchronized files. The improvement is only done for synchronized files where the best-candidate has lost less than x % of the messages (i.e. the number of received messages is greater than (100−x) %). 15≦x≦45 and more advantageously x=30, i.e. the number of received messages is greater than 70%. If even the best-candidate file has lost more than x % messages, it is considered that the other application data file cannot provide the missing messages).
The claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the analysis functions, described by Etgen, with the system for high reliability and high performance, described in De Schacht, to rank server log files. Modification would have been obvious to one of ordinary skill in the art because this would give visibility to help modify and adjust log files for prior and/or changing circumstances. Motivation to do so would be to ensure the integrity of the data. Etgen, ¶ [0026], teaches calculating data integrity metrics for Web server activity log analysis. The data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions. These data integrity metrics may be used by a log analysis process to “fill-in” holes in data if desired. For example, reports showing total hit counts may be generated with “fill-in” data when data integrity problems are identified. Such a report may state that the numbers reflect the likely number of total hits with some measure of data integrity.

Regarding claim 2, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches replaying the subset of replay candidate log files includes analyzing intermediate log record counts to determine gaps in the candidate log files (Etgen, ¶ [0040], teaches the process begins with a specification of time segments and time gap tolerance (step 600). The specification of time gap tolerances may be made through an automated process analyzing “clean” example logs. As used herein, “clean” logs are logs that are known to contain no time gaps due to data loss. In this example, a user provides 4 example logs that are believed to be free of data loss. The process in step 600 goes through the logs record by record and combines all of the data from matching 30 minute chunks of time. The number of chunks may vary because of differences in log file time coverage. For example, the first chunk may be the combined data from all four logs for the time period of 12:00 a.m.-12:29 a.m.).
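
For illustration only, the time-segment analysis cited from Etgen, ¶ [0040], can be sketched as follows. This is a minimal sketch under stated assumptions, not Etgen's implementation: the function names are hypothetical, each log is modeled as a list of record timestamps, and the tolerance for a 30 minute chunk is assumed to be the largest gap between consecutive records observed in that chunk across the “clean” logs. Gaps that span a chunk boundary are ignored for simplicity.

```python
from collections import defaultdict
from datetime import datetime

CHUNK_MINUTES = 30  # chunk length taken from the cited 30 minute example


def chunk_index(ts: datetime) -> int:
    """Index of the time-of-day chunk; 12:00 a.m.-12:29 a.m. is chunk 0."""
    return (ts.hour * 60 + ts.minute) // CHUNK_MINUTES


def gaps_by_chunk(timestamps):
    """Map chunk index -> gaps in seconds between consecutive records in that chunk."""
    gaps = defaultdict(list)
    ordered = sorted(timestamps)
    for prev, cur in zip(ordered, ordered[1:]):
        same_chunk = prev.date() == cur.date() and chunk_index(prev) == chunk_index(cur)
        if same_chunk:
            gaps[chunk_index(prev)].append((cur - prev).total_seconds())
    return gaps


def gap_tolerances(clean_logs):
    """Combine matching chunks from all clean logs; tolerance = largest observed gap."""
    combined = defaultdict(list)
    for log in clean_logs:
        for chunk, gaps in gaps_by_chunk(log).items():
            combined[chunk].extend(gaps)
    return {chunk: max(gaps) for chunk, gaps in combined.items()}


def suspect_chunks(log, tolerances):
    """Chunks of a new log whose largest gap exceeds the tolerance for that chunk."""
    flagged = set()
    for chunk, gaps in gaps_by_chunk(log).items():
        limit = tolerances.get(chunk)
        if limit is not None and max(gaps) > limit:
            flagged.add(chunk)
    return flagged
```

A chunk flagged by `suspect_chunks` corresponds to a time segment whose gap between records is large enough, relative to the clean logs, to suggest that data may have been lost.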

Regarding claim 3, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches generating the first set of log files at one or more host servers associated with the database system (Etgen, ¶ [0020], teaches logs generated by a server); and
communicating the first set of log files from the one or more host servers to the database system (Etgen, ¶ [0020], teaches the analysis of logs generated by a server may be analyzed using a data processing system similar to data processing system).  

Regarding claim 4, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the requesting comprises: communicating a replay request from the database system to one or more host servers associated with the database system, wherein the one or more host servers respond to the replay request by providing the subset of the replay candidate log files (De Schacht, ¶ [0078], teaches the system of the present invention must receive by determining the best-candidate of log files 209(a)-(c) on each of log servers 203(a)-(b). The decision of the best-candidate is done by the correlation batch. De Schacht, ¶ [0084], teaches the system also improves the quality of the selected best-candidate by retrieving a part of missing messages in other synchronized files. The improvement is only done for synchronized files where the best-candidate has lost less than x % of the messages (i.e. the number of received messages is greater than (100−x) %)).

Regarding claim 5, Etgen teaches a non-transitory computer readable medium having computer-executable instructions stored thereon and configurable to be executed by a processor to perform a method (Etgen, ¶ [0001], teaches the present invention provides a method, apparatus, and computer implemented instructions for calculating data integrity metrics for Web server activity log analysis. Etgen, ¶ [0060], teaches the processes of the present invention are capable of being distributed in the form of a computer readable medium of instructions) comprising:
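receiving, at a database system, the first set of log files, each of the first set of log files including a plurality of log file records (Etgen, ¶ [0027], teaches the analysis is conducted using metrics database. The analysis may examine time segments within the log to determine whether time gaps between data are sufficient to indicate that a loss of data has occurred);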
storing, at the database system, a second set of log files, each of the second set of log files corresponding to a respective one of the first set of log files (Etgen, ¶ [0018], teaches server may store data in logs, which may reflect accesses and requests by clients. Further, Etgen, ¶ [0020], teaches the analysis of logs generated by a server may be analyzed using a data processing system similar to data processing system. Etgen, ¶ [0040], teaches the specification of time gap tolerances may be made through an automated process analyzing “clean” example logs. As used herein, “clean” logs are logs that are known to contain no time gaps due to data loss. In this example, a user provides 4 example logs that are believed to be free of data loss. The process in step 600 goes through the logs record by record and combines all of the data from matching 30 minute chunks of time); 
comparing, at the database system, record counts of the first set of log files with record counts of the second set of log files (Etgen, ¶ [0026], teaches the data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions. These data integrity metrics may be used by a log analysis process to “fill-in” holes in data if desired. For example, reports showing total hit counts may be generated with “fill-in” data when data integrity problems are identified. Such a report may state that the numbers reflect the likely number of total hits with some measure of data integrity);
Etgen teaches the limitations as identified above. 
Etgen does not explicitly teach:
ranking, in response to the comparing, the second set of log files based on a completeness impact determined for each of the second set of log files, wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and the corresponding one of the first set of log files; 
determining a set of replay candidate log files based on the ranking of the second set of log files and their respective completeness impacts; and 
requesting, by the database system, a replaying of a subset of the replay candidate log files that achieves a target completeness value.  
However, De Schacht teaches:
ranking, in response to the comparing, the second set of log files based on a completeness impact determined for each of the second set of log files (De Schacht, ¶ [0018], teaches the best-candidate file is chosen from a set of application data files for a given interval from the plurality of log servers and that have the same start and stop points. Further, De Schacht, ¶ [0020], teaches the best-candidate file having lost application messages and not having lost more than x percent of application messages for the interval is augmented by the lost application messages existing in other files of the set of files, x being predetermined. In one embodiment, x is comprised between fifteen and forty five), wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and a corresponding one of the first set of log files (De Schacht, ¶ [0019], teaches the best-candidate file is chosen from among the chosen set of files, the file with the lowest application message loss rate. According to an advantageous embodiment, in case some application data files have the same number of application messages, then the best-candidate file is chosen from among the application data files with the lowest application message loss rate, the file with the lowest control message loss rate);
determining a set of replay candidate log files based on the ranking of the second set of log files and their respective completeness impacts (De Schacht, ¶ [0054, 0055], discloses the best-candidate file having lost application messages and not having lost more than x percent of application messages for the interval is augmented by the lost application messages existing in other files of the set of files, x being predetermined. In one embodiment, x is comprised between fifteen and forty five … upon determining from among a plurality of application data files from each of the plurality of log servers, an application data file as a best-candidate for a given interval, the server forwards the best-candidate file for application processing); and 
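requesting, by the database system, a replaying of a subset of the replay candidate log files that achieves a target completeness value (De Schacht, ¶ [0084], teaches the system also improves the quality of the selected best-candidate by retrieving a part of missing messages in other synchronized files. The improvement is only done for synchronized files where the best-candidate has lost less than x % of the messages (i.e. the number of received messages is greater than (100−x) %). 15≦x≦45 and more advantageously x=30, i.e. the number of received messages is greater than 70%. If even the best-candidate file has lost more than x % messages, it is considered that the other application data file cannot provide the missing messages).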
The claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the analysis functions, described by Etgen, with the system for high reliability and high performance, described in De Schacht, to rank server log files. Modification would have been obvious to one of ordinary skill in the art because this would give visibility to help modify and adjust log files for prior and/or changing circumstances. Motivation to do so would be to ensure the integrity of the data. Etgen, ¶ [0026], teaches calculating data integrity metrics for Web server activity log analysis. The data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions. These data integrity metrics may be used by a log analysis process to “fill-in” holes in data if desired. For example, reports showing total hit counts may be generated with “fill-in” data when data integrity problems are identified. Such a report may state that the numbers reflect the likely number of total hits with some measure of data integrity.

Regarding claim 6, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches replaying the subset of replay candidate log files includes analyzing intermediate log record counts to determine gaps in the candidate log files (Etgen, ¶ [0040], teaches the process begins with a specification of time segments and time gap tolerance (step 600). The specification of time gap tolerances may be made through an automated process analyzing “clean” example logs. As used herein, “clean” logs are logs that are known to contain no time gaps due to data loss. In this example, a user provides 4 example logs that are believed to be free of data loss. The process in step 600 goes through the logs record by record and combines all of the data from matching 30 minute chunks of time. The number of chunks may vary because of differences in log file time coverage. For example, the first chunk may be the combined data from all four logs for the time period of 12:00 a.m.-12:29 a.m.).
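
For illustration only, the chunking described in the cited passage of Etgen can be sketched as follows. This is a minimal sketch and not Etgen's implementation: the function names, the representation of a log as a list of timestamps, and the example data are all assumptions introduced here. Only the 30 minute chunk length comes from the cited paragraph.

```python
from collections import Counter
from datetime import datetime

CHUNK_MINUTES = 30  # chunk length stated in the cited passage


def chunk_index(ts: datetime) -> int:
    """Index (0..47) of the 30 minute chunk of the day that contains ts."""
    return (ts.hour * 60 + ts.minute) // CHUNK_MINUTES


def combine_chunks(logs):
    """Count records per chunk across all logs (hypothetical helper)."""
    counts = Counter()
    for log in logs:
        for ts in log:
            counts[chunk_index(ts)] += 1
    return counts


# Hypothetical example logs; each log is a list of record timestamps.
log_a = [datetime(2020, 1, 1, 0, 5), datetime(2020, 1, 1, 0, 20), datetime(2020, 1, 1, 0, 40)]
log_b = [datetime(2020, 1, 2, 0, 10), datetime(2020, 1, 2, 1, 15)]
combined = combine_chunks([log_a, log_b])
```

Records from different days fall into the same chunk when their time of day matches, which mirrors the "matching 30 minute chunks" language of the passage.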

Regarding claim 7, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches the first set of log files is received from one or more host servers associated with the database system (Etgen, ¶ [0008], teaches addressing data integrity in logs in a data processing system. Etgen, ¶ [0020], teaches the analysis of logs generated by a server may be analyzed using a data processing system similar to data processing system. Further, Etgen, ¶ [0027], teaches this analysis may include comparing the log to prior logs to determine whether any deviations in the current log are sufficient to warrant an indication that a data loss has occurred).

Regarding claim 8, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the requesting comprises: communicating a replay request from the database system to one or more host servers associated with the database system, wherein the one or more host servers respond to the replay request by providing the subset of the replay candidate log files (De Schacht, ¶ [0078], teaches the system of the present invention must receive by determining the best-candidate of log files 209(a)-(c) on each of log servers 203(a)-(b). The decision of the best-candidate is done by the correlation batch. De Schacht, ¶ [0084], teaches the system also improves the quality of the selected best-candidate by retrieving a part of missing messages in other synchronized files. The improvement is only done for synchronized files where the best-candidate has lost less than x % of the messages (i.e. the number of received messages is greater than (100−x) %)).

Regarding claim 9, Etgen teaches a database system comprising a processor in communication with a memory element having computer executable instructions stored thereon and configurable to be executed by the processor (Etgen, ¶ [0021], teaches data processing system may be a symmetric multiprocessor (SMP) system including a plurality of processors connected to system bus. Alternatively, a single processor system may be employed. Also connected to system bus is memory) to cause the database system to: 
receive a first set of log files, each of the second set of log files including a plurality of log file records (Etgen, ¶ [0027], teaches the analysis is conducted using metrics database. The analysis may examine time segments within the log to determine whether time gaps between data are sufficient to indicate that a loss of data has occurred. Etgen, ¶ [0060], teaches logs may be selected from section for analysis. Logs may be added to section 404 by selecting “Add” button. Selection of this button results in a display of a window or menu presenting logs that may be selected for analysis); 
store a second set of log files, each of the second set of log files corresponding to a respective one of the first set of log files (Etgen, ¶ [0018], teaches server may store data in logs, which may reflect accesses and requests by clients. Further, Etgen, ¶ [0020], teaches the analysis of logs generated by a server may be analyzed using a data processing system similar to data processing system. Etgen, ¶ [0040], teaches the specification of time gap tolerances may be made through an automated process analyzing “clean” example logs. As used herein, “clean” logs are logs that are known to contain no time gaps due to data loss. In this example, a user provides 4 example logs that are believed to be free of data loss. The process in step 600 goes through the logs record by record and combines all of the data from matching 30 minute chunks of time); 
compare record counts of the first set of log files with record counts of the second set of log files (Etgen, ¶ [0026], teaches the data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions); 
Etgen teaches the limitations as identified above. 
Etgen does not explicitly teach:
rank, in response to comparing the record counts, the second set of log files based on a completeness impact determined for each of the second set of log files, wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and a corresponding one of the first set of log files; 
determine a set of replay candidate log files based on ranking of the second set of log files and their respective completeness impacts; and 
request a replaying of a subset of the replay candidate log files that achieves a target completeness value.
However, De Schacht teaches:
rank, in response to comparing the record counts, the second set of log files based on a completeness impact determined for each of the second set of log files (De Schacht, ¶ [0018], teaches the best-candidate file is chosen from a set of application data files for a given interval from the plurality of log servers and that have the same start and stop points. Further, De Schacht, ¶ [0020], teaches the best-candidate file having lost application messages and not having lost more than x percent of application messages for the interval is augmented by the lost application messages existing in other files of the set of files, x being predetermined. In one embodiment, x is comprised between fifteen and forty five), wherein the completeness impact for each of the second set of log files is computed based on a difference in record counts between that particular one of the second set of log files and a corresponding one of the first set of log files (De Schacht, ¶ [0019], teaches the best-candidate file is chosen from among the chosen set of files, the file with the lowest application message loss rate. According to an advantageous embodiment, in case some application data files have the same number of application messages, then the best-candidate file is chosen from among the application data files with the lowest application message loss rate, the file with the lowest control message loss rate); 
determine a set of replay candidate log files based on ranking of the second set of log files and their respective completeness impacts (De Schacht, ¶ [0027], teaches a billing server as implemented in the system of the present invention must receive by determining the best-candidate of log files on each of log servers. The decision of the best-candidate is done by the correlation batch. Further, De Schacht, ¶ [0060], teaches the purpose of the control messages is two-fold. First, these messages ; and 
request a replaying of a subset of the replay candidate log files that achieves a target completeness value (De Schacht, ¶ [0084], teaches the system also improves the quality of the selected best-candidate by retrieving a part of missing messages in other synchronized files. The improvement is only done for synchronized files where the best-candidate has lost less than x % of the messages (i.e. the number of received messages is greater than (100−x) %). Advantageously, 15≦x≦45 and more advantageously x=30, i.e. the number of received messages is greater than 70%. If even the best-candidate file has lost more than x % messages, it is considered that the other application data file cannot provide the missing messages).
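The claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. It would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the analyses functions, described by Etgen, with the system for high reliability and high performance, described in De Schacht, to rank server log files. Modification would have been obvious to one of ordinary skill in the art because this would give visibility to help modify and adjust log files for prior and/or changing circumstances. Motivation to do so would be to ensure the integrity of the data. Etgen, ¶ [0026], teaches calculating data integrity metrics for Web server activity log analysis. The data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions. These data integrity metrics may be used by a log analysis process to “fill-in” holes in data if desired. For example, reports showing total hit counts may be generated with “fill-in” data when data integrity problems are identified. Such a report may state that the numbers reflect the likely number of total hits with some measure of data integrity.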
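
For illustration only, the rank, determine, and request limitations recited in claim 9 can be read as the following sequence. This is a minimal sketch under stated assumptions and is neither the applicant's nor either reference's implementation: the function names, the dictionary inputs, and the greedy largest-impact-first selection are hypothetical. The claim language itself requires only that the completeness impact be based on a difference in record counts and that the replayed subset achieve a target completeness value.

```python
def rank_by_impact(expected, stored):
    """Return (name, impact) pairs, largest record-count shortfall first.

    expected: record counts of the first (received) set of log files.
    stored:   record counts of the corresponding second (stored) set.
    Both mappings and their keys are hypothetical.
    """
    impacts = {name: max(expected[name] - stored.get(name, 0), 0) for name in expected}
    return sorted(impacts.items(), key=lambda kv: kv[1], reverse=True)


def select_for_replay(expected, stored, target):
    """Choose highest-impact files until the target completeness is reached.

    Assumes at least one expected record, and assumes that replaying a
    file restores all of its missing records.
    """
    total = sum(expected.values())
    have = sum(min(stored.get(name, 0), expected[name]) for name in expected)
    chosen = []
    for name, impact in rank_by_impact(expected, stored):
        if have / total >= target or impact == 0:
            break
        chosen.append(name)
        have += impact
    return chosen, have / total


# Hypothetical record counts for three log files.
expected = {"a": 100, "b": 100, "c": 100}
stored = {"a": 100, "b": 60, "c": 90}
chosen, completeness = select_for_replay(expected, stored, target=0.95)
```

With these hypothetical counts, replaying file "b" alone raises completeness from 250/300 to 290/300, which exceeds the 0.95 target, so file "c" is not requested.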

Regarding claim 10, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches replaying the subset of replay candidate log files includes analyzing intermediate log record counts to determine gaps in the candidate log files (Etgen, ¶ [0040], teaches the process begins with a specification of time segments and time gap tolerance (step 600). The specification of time gap tolerances may be made through an automated process analyzing “clean” example logs. As used herein, “clean” logs are logs that are known to contain no time gaps due to data loss. In this example, a user provides 4 example logs that are believed to be free of data loss. The process in step 600 goes through the logs record by record and combines all of the data from matching 30 minute chunks of time. The number of chunks may vary because of differences in log file time coverage. For example, the first chunk may be the combined data from all four logs for the time period of 12:00 a.m.-12:29 a.m.).

Regarding claim 11, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches the first set of log files is received from one or more host servers associated with the database system (Etgen, ¶ [0008], teaches addressing data integrity in logs in a data processing system. Etgen, ¶ [0020], teaches the analysis of logs generated by a server may be analyzed using a data processing system similar to data processing system. Further, Etgen, ¶ [0027], teaches this analysis may include comparing the log to prior logs to determine whether any deviations in the current log are sufficient to warrant an indication that a data loss has occurred).

Regarding claim 12, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the requesting comprises: communicating a replay request from the database system to one or more host servers associated with the database system, wherein the one or more host servers respond to the replay request by providing the subset of the replay candidate log files (De Schacht, ¶ [0078], teaches the system of the present invention must receive by determining the best-candidate of log files 209(a)-(c) on each of log servers 203(a)-(b). The decision of the best-candidate is done by the correlation batch. De Schacht, ¶ [0084], teaches the system also improves the quality of the selected best-candidate by retrieving a part of missing messages in other synchronized files. The improvement is only done for synchronized files where the best-candidate has lost less than x % of the messages (i.e. the number of received messages is greater than (100−x) %)).

Regarding claim 13, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches the step of defining a plurality of completeness levels associated with different log file completeness at the database system (Etgen, ¶ [0033], teaches a user may manually define time segments and time gap tolerances by selecting option in window 402 in FIG. 4A), wherein the subset of the replay candidate log files to be replayed is determined based on a selected one of the plurality of completeness levels (Etgen, ¶ [0058-0059], teaches the present invention provides an improved method, apparatus, and computer implemented instructions for calculating data integrity metrics for Web server activity log analysis. This mechanism provides an ability to identify when a log is missing data. The mechanism includes determining whether time gaps for data points, such as hits, page views, or session exceed some threshold indicating that data is missing … Additionally, the mechanism also may fill-in missing data to increase the accuracy or integrity of the report. The data used to fill-in missing data is taken from prior logs in these examples. The actual data selected is based on comparing similar times, dates, or days of the week from the prior logs with the corresponding times, dates, or days of the week in the portion of the log in which data is missing. Of course other mechanisms or dimensions may be used to identify or recreate the missing data depending on the particular implementation).
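It would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the defining features, described by Etgen, with the system for high reliability and high performance, described in De Schacht, to set threshold for determining best candidate log file with less data loss. Modification would have been obvious to one of ordinary skill in the art because this would show the reliability of log data. Motivation to do so would be to ensure the integrity of the data. Etgen, ¶ [0026], teaches calculating data integrity metrics for Web server activity log analysis. The data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions. These data integrity metrics may be used by a log analysis process to “fill-in” holes in data if desired. For example, reports showing total hit counts may be generated with “fill-in” data when data integrity problems are identified. Such a report may state that the numbers reflect the likely number of total hits with some measure of data integrity.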

Regarding claim 14, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the first set of log files are received from a plurality of host servers (De Schacht, ¶ [0008], teaches at a plurality of log servers coupled to at least an application server, each application server being associated to an application: receiving asynchronously, from the at least one application server, application messages containing application information for an application transaction, each application message being received by at least some log servers among the plurality of log servers); and 
the replay candidate log files are replayed only from a selected number of the plurality of host servers (De Schacht, ¶ [0060], teaches each control message will be used by the correlation algorithm to select the best-candidate amongst the synchronized files. Further, De Schacht, ¶ [0096], teaches supplement any missing application messages from the application data files for the same interval of the other log servers into the best-candidate log file).

Regarding claim 15, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the first set of log files are received from a plurality of host servers (De Schacht, ¶ [0008], teaches at a plurality of log servers coupled to at least an application server, each application server being associated to an application: receiving asynchronously, from the at least one application server, application messages containing application information for an application transaction, each application message being received by at least some log servers among the plurality of log servers); and 
the ranking of the second set of log files results in a list of host servers ordered according to their impact on a log completeness value that is compared to the target completeness value (De Schacht, ¶ [0055], teaches upon determining from among a plurality of application data files from each of the plurality of log servers, an application data file as a best-candidate for a given interval, the server forwards the best-candidate file for application processing. Further, De Schacht, ¶ [0080], teaches the system aligns the open file/close file events in different control files of each log server 203(a)-(b). The alignment is based on the timestamp of the events. A quorum of ⌈(n+1)/2⌉ is needed to agree on an alignment. The alignment simply indicates the files for which the stream has been split on identical points in time. In this nominal case, the system determines the best-candidate amongst the synchronized application data files 209(a)-(c) by selecting the application data file that contains firstly the most messages and secondly the least lost checkpoint messages).

Regarding claim 16, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches the step of defining a plurality of completeness levels associated with different log file completeness at the database system (Etgen, ¶ [0033], teaches a user may manually define time segments and time gap tolerances by selecting option in window 402 in FIG. 4A), wherein the subset of the replay candidate log files to be replayed is determined based on a selected one of the plurality of completeness levels (Etgen, ¶ [0058-0059], teaches the present invention provides an improved method, apparatus, and computer implemented instructions for calculating data integrity metrics for Web server activity log analysis. This mechanism provides an ability to identify when a log is missing data. The mechanism includes determining whether time gaps for data points, such as hits, page views, or session exceed some threshold indicating that data is missing … Additionally, the mechanism also may fill-in missing data to increase the accuracy or integrity of the report. The data used to fill-in missing data is taken from prior logs in these examples. The actual data selected is based on comparing similar times, dates, or days of the week from the prior logs with the corresponding times, dates, or days of the week in the portion of the log in which data is missing. Of course other mechanisms or dimensions may be used to identify or recreate the missing data depending on the particular implementation).


Regarding claim 17, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the first set of log files are received from a plurality of host servers (De Schacht, ¶ [0008], teaches at a plurality of log servers coupled to at least an application server, each application server being associated to an application: receiving asynchronously, from the at least one application server, application messages containing application information for an application transaction, each application message being received by at least some log servers among the plurality of log servers); and 
the replay candidate log files are replayed only from a selected number of the plurality of host servers (De Schacht, ¶ [0060], teaches each control message will be used by the correlation algorithm to select the best-candidate amongst the synchronized files. Further, De Schacht, ¶ [0096], teaches supplement any missing application messages from the application data files for the same interval of the other log servers into the best-candidate log file).  

Regarding claim 18, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the first set of log files are received from a plurality of host servers (De Schacht, ¶ [0008], teaches at a plurality of log servers coupled to at least an application server, each application server being associated to an application: receiving asynchronously, from the at least one application server, application messages containing application information for an application transaction, each application message being received by at least some log servers among the plurality of log servers); and 
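the ranking of the second set of log files results in a list of host servers ordered according to their impact on a log completeness value that is compared to the target completeness value (De Schacht, ¶ [0055], teaches upon determining from among a plurality of application data files from each of the plurality of log servers, an application data file as a best-candidate for a given interval, the server forwards the best-candidate file for application processing. Further, De Schacht, ¶ [0080], teaches the system aligns the open file/close file events in different control files of each log server 203(a)-(b). The alignment is based on the timestamp of the events. A quorum of ⌈(n+1)/2⌉ is needed to agree on an alignment. The alignment simply indicates the files for which the stream has been split on identical points in time. In this nominal case, the system determines the best-candidate amongst the synchronized application data files 209(a)-(c) by selecting the application data file that contains firstly the most messages and secondly the least lost checkpoint messages).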

Regarding claim 19, Etgen and De Schacht teach the claimed invention substantially as claimed, and Etgen further teaches the step of defining a plurality of completeness levels associated with different log file completeness at the database system (Etgen, ¶ [0033], teaches a user may manually define time segments and time gap tolerances by selecting option in window 402 in FIG. 4A), wherein the subset of the replay candidate log files to be replayed is determined based on a selected one of the plurality of completeness levels (Etgen, ¶ [0058-0059], teaches the present invention provides an improved method, apparatus, and computer implemented instructions for calculating data integrity metrics for Web server activity log analysis. This mechanism provides an ability to identify when a log is missing data. The mechanism includes determining whether time gaps for data points, such as hits, page views, or session exceed some threshold indicating that data is missing … Additionally, the mechanism also may fill-in missing data to increase the accuracy or integrity of the report. The data used to fill-in missing data is taken from prior logs in these examples. The actual data selected is based on comparing similar times, dates, or days of the week from the prior logs with the corresponding times, dates, or days of the week in the portion of the log in which data is missing. Of course other mechanisms or dimensions may be used to identify or recreate the missing data depending on the particular implementation).
It would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the defining features, described by Etgen, with the system for high reliability and high performance, described in De Schacht, to set threshold for determining best candidate log file with less data loss. Modification would have been obvious to one of ordinary skill in the art because this would show the reliability of log data. Motivation to do so would be to ensure the integrity of the data. Etgen, ¶ [0026], teaches calculating data integrity metrics for Web server activity log analysis. The data integrity metrics are provided for variables, such as, for example, hits, requests, page views, and sessions. These data integrity metrics may be used by a log analysis process to “fill-in” holes in data if desired. For example, reports showing total hit counts may be generated with “fill-in” data when data integrity problems are identified. Such a report may state that the numbers reflect the likely number of total hits with some measure of data integrity.

Regarding claim 20, Etgen and De Schacht teach the claimed invention substantially as claimed, and De Schacht further teaches the first set of log files are received from a plurality of host servers (De Schacht, ¶ [0008], teaches at a plurality of log servers coupled to at least an application server, each application server being associated to an application: receiving asynchronously, from the at least one application server, application messages containing application information for an application transaction, each application message being received by at least some log servers among the plurality of log servers); 
the ranking of the second set of log files results in a list of host servers ordered according to their impact on a log completeness value that is compared to the target completeness value (De Schacht, ¶ [0055], teaches upon determining from among a plurality of application data files from each of the plurality of log servers, an application data file as a best-candidate for a given interval, the server forwards the best-candidate file for application processing. Further, De Schacht, ¶ [0080], teaches the system aligns the open file/close file events in different control files of each log server 203(a)-(b). The alignment is based on the timestamp of the events. A quorum of ⌈(n+1)/2⌉ is needed to agree on an alignment. The alignment simply indicates the files for which the stream has been split on identical points in time. In this nominal case, the system determines the best-candidate amongst the synchronized application data files 209(a)-(c) by selecting the application data file that contains firstly the most messages and secondly the least lost checkpoint messages); and 
the replay candidate log files are replayed only from a selected number of the plurality of host servers (De Schacht, ¶ [0060], teaches each control message will be used by the correlation algorithm to select the best-candidate amongst the synchronized files. Further, De Schacht, ¶ [0096], teaches supplement any missing application messages from the application data files for the same interval of the other log servers into the best-candidate log file).
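
For illustration only, the quorum and best-candidate criteria quoted from De Schacht, ¶ [0080], can be sketched as follows. This is a minimal sketch and not De Schacht's implementation: the function names, the dictionary field names, and the example counts are assumptions. Only the quorum formula and the two ordering criteria (most messages first, then fewest lost checkpoint messages) come from the cited paragraph.

```python
import math


def quorum(n: int) -> int:
    """Quorum size ceil((n + 1) / 2) for n log servers."""
    return math.ceil((n + 1) / 2)


def best_candidate(files):
    """Pick the file with the most messages, then the fewest lost checkpoints.

    Each file is a dict with hypothetical keys "messages" and
    "lost_checkpoints".
    """
    return max(files, key=lambda f: (f["messages"], -f["lost_checkpoints"]))


# Hypothetical synchronized application data files for one interval.
files = [
    {"name": "209a", "messages": 98, "lost_checkpoints": 1},
    {"name": "209b", "messages": 98, "lost_checkpoints": 0},
    {"name": "209c", "messages": 95, "lost_checkpoints": 0},
]
winner = best_candidate(files)
```

Here files 209a and 209b tie on message count, so the tie is broken by the lower number of lost checkpoint messages and 209b is selected.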


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALICIA M ANTOINE whose telephone number is (571)431-0687.  The examiner can normally be reached on Mon - Fri: 9am - 3pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/ALICIA M ANTOINE/Examiner, Art Unit 2162
2/2/2021


/PIERRE M VITAL/Supervisory Patent Examiner, Art Unit 2162