DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent International Application No. PCT/CN2016/099054, filed on 9/14/2016, which is based on and claims the benefits of priority to Chinese Application No. 201510625059.4, filed on 9/25/2015.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 1/20/2021 has been entered.
 

Remarks
In response to the communication filed on January 20th, 2021, claims 1, 3, 5, 12-14, and 23 were amended. Claims 1, 3-19 and 23 are presently pending in the application.

Response to Arguments
Applicant’s arguments with respect to claim 1, that Marrelli as modified by Bentkofsky fails to teach “in response to the total amount of dirty data not reaching the fault tolerant threshold, skipping the detected one or more pieces of dirty data and continuing with synchronizing one or more pieces of data in the to-be-synchronized data after the detected one or more pieces of dirty data”, have been considered but are moot in view of the new grounds of rejection. The examiner has introduced a new reference, Gao, that discloses the limitation newly added by the applicant, and the claims therefore remain rejected. This combination would meet the limitations for detecting and skipping “dirty data” as currently presented in the amended claim language. For these reasons the 103 rejection is maintained.

	


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
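A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.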

Claims 1, 3-19 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Marrelli et al., U.S. PGPub Number 20160070725 (Hereinafter Marrelli), in view of Bentkofsky et al., U.S. PGPub Number 20130073524 (Hereinafter Bentkofsky), and further in view of Gao et al., U.S. PGPub Number 20120221509 (Hereinafter Gao).



As for claim 1, Marrelli teaches a data synchronization method, comprising: acquiring source data from a source database (Marrelli; Client systems 114 enable users to communicate with server systems 110 to perform data quality analysis, cleansing, and transformation for migration of data from source systems 140 to target system 150. The server systems include a database management system 116 including analysis modules 120 to perform the data quality analysis, data cleansing, data transformations, and data migration as described below. [0026];); 
converting the source data into to-be-synchronized data matching a data meta format of a target database (Marrelli; Alignment area 124 receives and stores data of source systems 140 from each staging area 122 (associated with a corresponding source system 140). The alignment area includes a common data model to receive data from each of the staging areas (and corresponding source systems 140). The common data model of the alignment area is derived from the data model of target system 150. However, the common data model varies slightly from the data model of the target system in order to enable source data records to be processed by a common cleansing data from staging areas 122 is transformed for transference to alignment area (converting the source data into to-be-synchronized data matching) 124. [0033]; Specifically, data from source systems 140 (FIG. 2) is received and stored in corresponding staging areas 122 for data quality assessment based on data quality rules for the target system at step 305. Data quality profiler module 128 utilizes a commercial data profiling tool to read data from source systems 140, create staging areas 122 based on the data models from the source systems, and move the data from the source systems to those staging areas. Business metadata may be utilized by data quality profiler module 128 to direct the data profiling tool to extract and create staging areas (data matching a data meta format of a target database) 122 for specific data within source systems 140. For example, the business metadata may indicate which data from the source systems are critical to business or other processes of the target system or are required by the target system. In this case, data quality profiler module 128 may initiate extraction and creation of staging areas 122 for data critical to and/or required by the target system. [0042];); 
synchronizing the to-be-synchronized data to the target database, comprising: determining whether one or more pieces of dirty data have been detected during the synchronization (Marrelli; Referring to FIG. 2, data migration projects typically include one or more source systems 140 and target system 150. Data is harmonized across source systems 140 and cleansed to fulfill data quality requirements for business or other processes of target system (synchronizing the to-be-synchronized data to the target database) 150 and/or to satisfy key performance indicators (KPI) necessary for the target system. The data usually resides in database 118 of database 0031]; These statuses may be determined during the source analysis phase for data records with data attributes (dirty data have been detected during the synchronization) deemed business critical, during the target process phase for data records with data attributes required for in-scope (or relevant) business processes, and in the load analysis phase for data records with data attributes of in-scope (or relevant) data domains. However, the statuses may be determined for any desired data records with any data attributes. Further, a data record may be associated with one or more of these statuses each associated with a corresponding data attribute. For example, a data record with a data attribute problematic in the source system and another data attribute problematic in the target system may be designated with the statuses of Dirty, Action needed in source and Fit for use, Conversion needed. [0052];); 
determining whether one or more pieces of dirty data have been detected during the synchronization (Marrelli; Data quality profiler module 128 (e.g., via one or more server systems 110) applies the associated sets of data quality rules (for source systems 140 and corresponding data attributes of target system 150) and (LS2T) mappings to the corresponding data attributes of the source systems to determine compliance of the data attributes with those source and target system rules. A record containing the data attribute is considered actionable or problematic (e.g., dirty) (determining whether dirty data has been detected) with respect to a source or 0045];).
Marrelli does not explicitly detail counting a total amount of dirty data including the detected one or more pieces of dirty data accumulated during the synchronization; determining whether the total amount of dirty data accumulated during the synchronization has reached a fault tolerant threshold that is set to lower an impact on the synchronization; and in response to the total amount of dirty data not reaching the fault tolerant threshold, skipping the detected one or more pieces of dirty data and continuing with synchronizing one or more pieces of data in the to-be-synchronized data after the detected one or more pieces of dirty data.
However, Bentkofsky teaches, in response to detecting one or more pieces of dirty data during the synchronization, skipping the detected one or more pieces of dirty data and continuing with synchronizing one or more pieces of data in the to-be-synchronized data after the detected one or more pieces of dirty data (Bentkofsky; For example, as records are identified by scanning the primary key hash chain, the record may be identified as "dirty", such as by recognizing a marking of the record with a dirty bit indicator, if it is part of an ongoing transaction, such as an update with one or more operations. Under normal circumstance, searches that encounter a dirty record indicating an update transaction that is underway would restart the search transaction while the update completes (continuing with synchronizing one or more pieces of data in the to-be-synchronized data after the detected one or more pieces of dirty data). The dirty bit audit may (1) record dirty records (one or more pieces of dirty data), and later (2) check that the record is no longer marked as dirty, or that the record record is marked as dirty (skipping the detected dirty data to continue with synchronizing the to-be-synchronized data) for more than 3 seconds in total, or the record has been considered dirty through at least 3 passes after one second of the record first being identified as dirty, the record may be considered persistently dirty. Any records meeting designated criteria may be logged as a persistent dirty record. [0079];).
It would have been obvious to one of ordinary skill in the art before the effective filing date, having both the teachings of Marrelli and Bentkofsky, which deal with synchronizing data across different databases, to have combined them by incorporating continuing synchronization of data when detecting data that is dirty (Bentkofsky) with acquiring, converting and detecting dirty data when synchronizing between databases (Marrelli). The motivation to combine is to make the system more efficient, so as to be beneficial in the timely and accurate discovery of data errors in databases, such as distributed databases, and to further help identify software errors and the like manifesting in a shared memory database before such errors would be discovered through routine scrubbing techniques (Bentkofsky [0005];).
Marrelli as modified by Bentkofsky does not explicitly detail detecting one or more pieces of dirty data from the to-be-synchronized data during the synchronization.
However, Gao teaches detecting one or more pieces of dirty data from the to-be-synchronized data during the synchronization; counting a total amount of dirty data including the detected one or more pieces of dirty data accumulated during the synchronization; and determining whether the total amount of dirty data accumulated during the synchronization has reached a fault tolerant threshold that is set to lower an impact on the synchronization (Gao; For the case where the determining in the above sub-steps 302-1 and 302-3 is performed based on the corresponding primary key value sets corresponding to attribute values exceeding a specified threshold percentage (fault tolerant threshold) in all the attribute values of the specific attribute of the target database table, in the sub-step 304-4, it may be determined, (counting a total amount of dirty data including the detected one or more pieces of dirty data accumulated during the synchronization: determining whether the total amount of dirty data accumulated during the synchronization has reached a fault tolerant threshold) with respect to the remaining attribute values other than the attribute values of the specific attribute of the target database table based on which the determining is performed in sub-step 302-1 and 302-3, whether the remaining attribute values have corresponding attribute values of the at least one other attribute of the source database table. In such a case, sub-step 304-4 may be executed during the execution of sub-step 302-1 or sub-step 302-3. That is to say, at the same time of determining whether the determine whether the specific attribute value of the specific attribute of the target database table is an orphaned value or dirty data (determining whether the total amount of dirty data accumulated during the synchronization), and thus being a data error. [0086];).
It would have been obvious to one of ordinary skill in the art before the effective filing date, having the teachings of Marrelli as modified by Bentkofsky, which deal with synchronizing data across different databases, and the teachings of Gao, which deal with determining dirty data, to have combined them by incorporating determining dirty data (Gao) with continuing synchronization of data when detecting data that is dirty and acquiring, converting and detecting dirty data when synchronizing between databases (Marrelli as modified by Bentkofsky). The motivation to combine is to make the system more efficient by generating a query to a source database (i.e., a business system database as the data source) according to the business meaning, acquiring the source data by executing the query against the source database, and comparing the source data with the target data to find data errors (Gao [0004];).
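For illustration only, the counting, threshold, and skip-and-continue limitations recited in claim 1 can be sketched as follows. The sketch is not taken from the application or from Marrelli, Bentkofsky, or Gao; the names `synchronize`, `write`, and `FaultToleranceExceeded` are hypothetical, and it is assumed that the target signals a dirty record by raising `ValueError`.

```python
from typing import Callable, Iterable, List, Tuple


class FaultToleranceExceeded(Exception):
    """Hypothetical error: accumulated dirty data reached the threshold."""


def synchronize(
    records: Iterable[object],
    write: Callable[[object], None],
    fault_tolerant_threshold: int,
) -> Tuple[int, List[object]]:
    """Write each record; skip dirty ones until the threshold is reached.

    Assumption: `write` raises ValueError when the target rejects a record.
    """
    dirty: List[object] = []
    written = 0
    for record in records:
        try:
            write(record)
            written += 1
        except ValueError:
            # Count the dirty record toward the running total.
            dirty.append(record)
            if len(dirty) >= fault_tolerant_threshold:
                # Threshold reached: stop instead of skipping further.
                raise FaultToleranceExceeded(
                    f"{len(dirty)} dirty records reached the threshold"
                )
            # Threshold not reached: skip this record and continue.
    return written, dirty
```

Under these assumptions, a run over five records containing two dirty records with a threshold of three completes and reports the two skipped records, while the same kind of input with a threshold of two stops at the second dirty record.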




Claim 23 comprises the same limitations as claim 1; the rejection rationale for claim 1 is applicable.




As for claims 3 and 14, Marrelli as modified by Bentkofsky and Gao teaches the method and apparatus of claims 1 and 12, further comprising: synchronizing the to-be-synchronized data to the target database in a batch including a plurality of pieces of data (Bentkofsky; As shown in FIG. 16, local database 1510 sends F sendfiles 1620-1 . . . 1620-F and initializing sendfile 1610 to remote database 1520 in order to update remote database 1520. The update files may be sent individually or in batches (synchronizing the to-be-synchronized data to the target database in a batch including a plurality of pieces), such as multiple sendfiles 1620, one sendfile 1620 and one initializing sendfile 1610, multiple sendfiles 1620 and one initializing sendfile 1610, sendfile 1620 alone, or initializing sendfile 1610 alone. [0128];); 
determining whether one or more pieces of dirty data have been detected in the batch (Bentkofsky; As subsequent records are found using the primary index, the list may be consulted to determine if records are persistently dirty (determining whether the dirty data has been detected in the batch). A record may be considered 0079];); 
notifying, in response to detecting one or more pieces of dirty data in the batch, the target database to roll back the synchronized data corresponding to the batch of the to-be-synchronized data (Bentkofsky; Transaction log checking involves checking a transaction log to confirm that the entries are consistent with the transaction state of the database. Typically, the transaction log details the individual operations that are performed for each database transaction, whether the changes are committed to the database or not. Uncommitted transactions may be rolled back (the target database to roll back the synchronized data in the to-be-synchronized data) during system restart so that a consistent database image is always present when the system is functioning normally. Transaction processing may involve the use of the transaction log, usually with a head (oldest logged operation) and a tail (newest logged operation). The individual operations logged depend on the database structure, but, in general, logical rules may be constructed to validate a state of the database against operations read from the log. [0091];); 
and synchronizing the rolled-back batch of the to-be-synchronized data to the target database based on a piece-by-piece transmission after the target database rolls back (Bentkofsky; Once the shared memory database map is initialized, the structures within the shared memory database are "walked", that is all references to key records the structure may be more deeply scanned allocating sub-parts of the structure to the memory map (based on a piece-by-piece transmission after the target database rolls back). As each bit is set, the bit offset may be first checked to be certain that it hasn't previously been set. This condition would indicate that a byte was used more than once. [0112];); 
and determining whether a transmitted piece of data includes the dirty data (Marrelli; The individual phases of the data quality analysis may be repeated any quantity of times until data is sufficiently cleansed. The thresholds for data quality scores may include any values indicating sufficient cleanliness or dirtiness of the data (e.g., threshold percentages (e.g., 60%, 70%, greater than (or equal to) a certain percentage, etc.) for clean data, threshold percentages for dirty data (e.g., 20%, 30%, less than (or equal to) a certain percentage, etc.), etc.) to determine whether further data cleansing should be performed. The individual phases of the data quality analysis may be performed serially and/or in parallel during any portion of the data migration (determining whether a transmitted piece of data includes the dirty data). [0138];). The motivation to combine is the same as previously presented. 
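For illustration only, the batch, roll-back, and piece-by-piece limitations recited in claims 3 and 14 can be sketched as follows. The sketch is not taken from the application or from the cited references; `InMemoryTarget`, its methods, and `sync_batch` are hypothetical names, and a record equal to `None` stands in for a piece of dirty data.

```python
from typing import List


class InMemoryTarget:
    """Hypothetical stand-in for a target database with roll-back support."""

    def __init__(self) -> None:
        self.rows: List[object] = []
        self._checkpoint = 0

    def begin(self) -> None:
        # Remember where the current batch starts.
        self._checkpoint = len(self.rows)

    def write_one(self, record: object) -> None:
        # Assumption: None represents a dirty record the target rejects.
        if record is None:
            raise ValueError("dirty record")
        self.rows.append(record)

    def rollback(self) -> None:
        # Discard everything written since the last begin().
        del self.rows[self._checkpoint:]


def sync_batch(batch: List[object], target: InMemoryTarget) -> List[object]:
    """Try the whole batch; on dirty data roll back and resend piece by piece."""
    target.begin()
    try:
        for record in batch:
            target.write_one(record)
        return []  # The batch was clean.
    except ValueError:
        target.rollback()
    dirty: List[object] = []
    for record in batch:
        try:
            target.write_one(record)
        except ValueError:
            # This transmitted piece is the dirty one.
            dirty.append(record)
    return dirty
```

In this sketch the piece-by-piece pass is what isolates the individual dirty record, since the batch pass only reveals that the batch as a whole contained dirty data.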


As for claims 4 and 15, Marrelli as modified by Bentkofsky and Gao teaches the method and apparatus of claims 2 and 13, further comprising: in response to the transmitted piece of data including the dirty data, recording the transmitted piece of data (Marrelli; These statuses may be determined during the source analysis phase for data records with data attributes deemed business critical, during the target process phase for data records with data attributes required for in-scope (or relevant) business processes, and in the load analysis phase for data records with data attributes of in-scope (or relevant) data domains. However, the statuses may be determined for any desired data records with any data attributes. Further, a data record may be associated with one or more of these statuses each associated with a corresponding data attribute (recording the transmitted piece of data as the dirty data). For example, a data record with a data attribute problematic in the source system and another data attribute problematic in the target system may be designated with the statuses of Dirty, Action needed in source and Fit for use, Conversion needed. [0052];).  


As for claims 5 and 16, Marrelli as modified by Bentkofsky and Gao teaches the method and apparatus of claims 1 and 13, further comprising: decomposing a synchronization job task into at least two sub-tasks (Bentkofsky; FIG. 1 depicts aspects of a high-level representation of checks according to an exemplary embodiments of the invention. As shown in FIG. 1, methods for verifying data in a distributed database may include executing one or more checks by a computer processor at different times with respect to other database functions, such as update operations. As discussed herein, an "update" may include one or more specific update operations, tasks or functions, that may operate on different database components (decomposing a synchronization job task into at least two sub-tasks). This may be accomplished in various ways, such as, for example, including multiple update instructions in a sendfile, performing an update operation including writing a new element and deleting an old element or pointer to the old element, applying a similar update to various database records, etc. [0049];).  The motivation to combine is the same as previously presented. 


As for claims 6 and 17, Marrelli as modified by Bentkofsky and Gao teaches the method and apparatus of claims 5 and 16, wherein acquiring source data from a source database further comprises: acquiring, by each sub-task from the database, source data to be processed by the sub-task itself (Marrelli; The weights are utilized to generate a weighted data quality score that provides a view of source data that needs to be cleansed prior to migrating the source data (acquiring, by each sub-task from the database, source data to be processed by the sub-task itself) to the target system and a prioritization direction for the data cleansing effort as described below. However, any desired weight values may be assigned to the data attributes of the source systems to reflect importance of the data attributes to the target system and business or other processes of the target system. Further, the designation of data attributes as business critical and/or target based may be determined by user analysis of the target system and/or various computerized tools (e.g., to determine the mandatory or required attributes or fields of the target system). [0056];).  


As for claims 7 and 18, Marrelli as modified by Bentkofsky and Gao teaches the method and apparatus of claims 5 and 16, wherein counting the total amount of dirty data further comprises: determining the total amount of dirty data according to the dirty data detected by the sub-tasks (Marrelli; The combined quantity is divided by a total quantity of data records containing the data attributes of interest of the data domain on source systems (determining the total amount of dirty data according to the dirty data detected by the sub-tasks) 140a, 140b to produce the overall dimension percentage value for the accuracy data quality dimension with respect to source systems 140a, 140b. This total quantity may be determined by combining or summing individual total quantities previously determined by the source systems for computation of the aggregate dimension percentage values described above. The overall dimension percentage value is typically normalized to an integer value between zero and one-hundred percent (e.g., rounding, truncation, etc.), but may be any value within any desired value range. [0082];).  


As for claims 8 and 19, Marrelli as modified by Bentkofsky and Gao teaches the method and apparatus of claims 7 and 17, wherein determining the total amount of dirty data according to the dirty data detected by the sub-tasks further comprises: performing polling on the sub-tasks to collect recorded dirty data; and summarizing polling results to generate the total amount of dirty data (Marrelli; In addition, the various percentage values for a domain of a source system and for a domain across plural source systems may be provided in a table or chart as illustrated, by way of example, at flows 715, 735, placed in a report for determining cleansing activities (summarizing polling results to generate the total amount of dirty data). For example, flow 715 illustrates a table or chart for a data domain of source system 140a indicating an aggregate dimension percentage value of 59% for the accuracy data quality dimension, an aggregate dimension percentage value of 75% for the completeness data quality dimension (performing polling on the sub-tasks to collect recorded dirty data), a domain percentage value (e.g., "Overall Score") of 49%, a business critical percentage value of 47%, and a target based percentage value of 67%. [0089];).  
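For illustration only, the polling and summarizing limitations recited in claims 8 and 19 can be sketched as follows. The sketch is not taken from the application or from the cited references; `SubTask`, `record_dirty`, `poll`, and `total_dirty` are hypothetical names, and each sub-task is assumed to keep its own list of recorded dirty data.

```python
from typing import Dict, List, Tuple


class SubTask:
    """Hypothetical sub-task that records the dirty data it detects."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._dirty: List[object] = []

    def record_dirty(self, record: object) -> None:
        self._dirty.append(record)

    def poll(self) -> List[object]:
        # Return a copy so the caller cannot mutate the sub-task's record.
        return list(self._dirty)


def total_dirty(sub_tasks: List[SubTask]) -> Tuple[int, Dict[str, List[object]]]:
    """Poll every sub-task, then summarize the results into one total."""
    results = {task.name: task.poll() for task in sub_tasks}
    total = sum(len(records) for records in results.values())
    return total, results
```

The per-sub-task results returned alongside the total could also serve as the basis for a dirty data list of the kind recited in claim 9, although that is an assumption of this sketch rather than something stated in the claims.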


As for claim 9, Marrelli as modified by Bentkofsky and Gao teaches the method of claim 8, further comprising: generating a dirty data list based on the summarized dirty data (Bentkofsky; A dirty bit audit checking may be configured to add records to a list of records (generating a dirty data list based on the summarized dirty data) that are initially found to be dirty, and also to remove records that are subsequently found to not be dirty from the list. [0081];). The motivation to combine is the same as previously presented. 



As for claim 10, Marrelli as modified by Bentkofsky and Gao teaches the method of claim 1, further comprising: pre-detecting the to-be-synchronized data to filter out dirty data in the to-be-synchronized data (Marrelli; The target process phase of the data quality analysis further determines whether the cleansing activities of the action plan (e.g., either in the source system or alignment area 124) have been performed correctly, and identifies the potential impact of actionable or problematic data (detecting the to-be-synchronized data to filter out dirty data) relative to the business or other processes that the actionable data supports. In other words, the target process phase provides an indication of the cleanliness of source data for the particular business or other processes of the target system utilizing that source data. During the target process phase, the statuses of the data records of the data attributes are updated as cleansing activities continue. This assists with prioritizing data cleansing efforts during the data migration and identifying problem areas by process domain for each source system. [0105];).  


As for claim 11, Marrelli as modified by Bentkofsky and Gao teaches the method of claim 1, wherein the source database is a relational database, and the target database is a relational database (Marrelli; The system may employ any number of any conventional or other databases, data stores or storage structures (e.g., files, databases, data structures, data or other repositories, etc.) to store information (e.g., cleansing data, transformation data, matrices, data quality metric scores, data from the source systems, data models, etc.). The database and metadata repository may be implemented by any number of any conventional or other databases (database is a relational database), data stores or storage structures (e.g., files, databases, data structures, data or other repositories, etc.) to store information (e.g., cleansing data, transformation data, matrices, data quality metric scores, data from the source systems, data models, business or other metadata, mappings, etc.). The database and/or metadata repository may be included within or coupled to the server, source, target, and/or client systems. The database and/or metadata repository may be remote from or local to the computer or other processing systems, and may store any desired data (e.g., cleansing data, transformation data, matrices, data quality metric scores, data from the source systems, data models, business or other metadata, mappings, etc.). [0133];).  

As for claim 13, Marrelli as modified by Bentkofsky and Gao teaches the apparatus of claim 12, configured to execute the set of instructions to cause the data synchronization apparatus to perform: controlling the synchronization process according to a preset processing rule if the total amount of dirty data has reached the fault tolerant threshold (Marrelli; The target process phase of the data quality analysis further determines whether the cleansing activities of the action plan (e.g., either in the source system or alignment area 124) have been performed correctly, and identifies the potential impact of actionable or problematic data relative to the business or other processes that the actionable data supports. In other words, the target process phase provides an indication of the cleanliness of source data for the particular business or other processes of the target system utilizing that source data (continuing with synchronization in response to the total amount of dirty data not reaching the threshold). During the target process phase, the statuses of the data records of the data attributes are updated as cleansing activities continue. This assists with prioritizing data cleansing efforts during the data migration and identifying problem areas by process domain for each source system. [0105];). The motivation to combine is the same as previously presented. 
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES E HEFFERN whose telephone number is (571)272-9605.  The examiner can normally be reached on Monday - Friday, 6:30 am - 3 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Boris Gorney, can be reached on 571-270-5626.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  


/BORIS GORNEY/Supervisory Patent Examiner, Art Unit 2158                                                                                                                                                                                                        



/J.E.H/Examiner, Art Unit 2158