|
|
This article looks at various scenarios where new data or data changes can "disappear" in merge replication. In each case this is by design according to the rules of the merge replication engine but if you are unaware of these scenarios you might think that there is something wrong with your setup, and perhaps even lose confidence in the technology. So, it is very important when designing your merge replication topology to understand each of the scenarios below. There may turn out to be others which I have missed and if this is the case please email me details and I'll add them in. (1) "Normal" ConflictsSuppose one subscriber deletes a record and another one updates the same record. By default the first subscriber to synchronize will "win" the conflict. So, if the delete statement wins, the updated record will disappear and in such cases the data will be removed as a result of a straightforward conflict and nothing more arcane. In trying to understand why the data has disappeared, we must always check 2 GUIs: the Replication Monitor and the Conflict Viewer to first try to understand what has happened. Assuming we have a HistoryVerboseLevel of 2 in the profile, we'll see the detailed log of each table in the Replication Monitor and notice that there have been conflicts. The Conflict Viewer will record the conflict details as below.
We can track the cause of such conflicts by looking at the MSmerge_conflicts_info table which gives the details of the cause of the conflict and the actual row identifier. The reason_text summarises the situation nicely: "The same row was updated at 'ServerX.testsub' and deleted at 'ServerY.AdventureWorks'. The resolver chose the delete as the winner." (2) Conflicts Which Cannot be ResolvedSuppose we have the publisher adds PK = '1' and a subscriber
adds PK = '1' and then they synchronize. In the case of a conflicting PK
value, the conflict will be raised and the data not converged.
In the replication monitor the uploaded and downloaded record
will both register a conflict. Such a conflict
cannot
be resolved using the conflict viewer - submitting either the
winner or the loser won't work. So, you'll have to manually
update the PK value at a particular replica so as to remove the
cause of conflict or just delete one of the records. After that
you can sync up again in order to allow for convergence. By the
way for this particular type of
conflict in Profiler I notice that it regenerates a conflict on
each synchronization. After attempting and failing the insert,
there is a subsequent call to sp_MSsetgentozero which
resets the row and then it gets picked up once again for the
next sync. Let me reiterate here that your application must
provide code which reacts appropriately once such conflicts are
raised. This applies to all the conflicts which are not resolved
by a winner (conflict_type = 5 to 10 inclusive in
the MSmerge_conflicts_info
table). (3) Compensating ChangesWhen the merge agent synchronizes it might be the case that an insert, update or a delete cannot be applied at a destination replica. This could be because of a particular constraint or trigger e.g. a check constraint violation at the replica. The merge agent then may send a compensating change back to the source replica to undo the failed change i.e. delete the source record in the case of a failed insert. This all occurs in the same sync session. There is an option which came into being in a service pack (I forget which) of SQL Server 2000 which controls this behaviour by setting the @compensate_for_errors parameter for sp_addmergearticle and this has been well-documented. In the above scenario a delete would be automatically sent to one of the subscribers during the same synchronization session. You might now wonder "how do I know I have 'experienced' a compensating change"? Good question! This is not so straightforward. There will be a conflict registered in the conflict viewer. In the case of an insert being compensated by a delete, the conflict viewer will show "A row insert at 'YourServername.testsub' could not be propagated to 'YourServername.AdventureWorks'. This failure can be caused by a constraint violation. ...". Unfortunately this message is not particular to compensating changes so we won't know this is the case explicitly. I'm looking into ways to provide a query which can determine that compensating changes have occurred and will add it to this article if I succeed. (4) Partition RealignmentSuppose we have three tables - Region, Company and Orders. These are related by PK-FK relationships such that Orders belong to Companies and Companies belong to Regions. We define dynamic partitions such that Subscriber "Sub_North" gets the North Region records and subscriber "Sub_South" gets the "South" records, by using a reference to HOST_NAME() in the filter and adding join filters to the FK tables:
The fact that there is a join filter defined in the publication between Regions and Companies and Orders means that Sub_North will only receive Companies and Orders from the North. Now, suppose we change a company from the North to the South region....what happens? Well first of all the Company data is realigned. Subscriber Sub_South will now also get all the associated orders. While subscriber North will have the Company deleted and all the associated orders will be deleted. If you are new to merge replication this might catch you by surprise. Actually it is quite logical, but we have to be aware that a change near the top of a hierarchy can result in the same amount of traffic as a reinitialization! Such deletes are visible in the Replication Monitor but there
are no conflicts recorded in the conflict viewer. As with (3) we
have no easy way of definitively determining that partition
re-alignment is the cause of the deletes! (5) Join Filters which "oppose" Foreign Keys (1 inserts)I wasn't aware of this issue until recently when a client showed me some research done by Steve Wright and James Rowland Jones. As it turned out it wasn't relevant to my particular client but I was aware that this is definitely something to watch out for. Articles are processed in a certain order which is well-documented. However in the case of merge replication with filters, things can go awry. In the example below, the Sales data will be processed prior to the Company and Region data because the ordering will be derived according to the order of the join filters. Although it's rare, sometimes it makes logical sense to have the join filters run contrary to the PK-FK order.
In the above example, suppose for the North subscriber a company is added to the North region and also an associated Northern sale is added to this company. We then synchronize. We would expect the subscriber who has the defined "North" filter to keep the new sales data. However the views which evaluate the partition contents will not allow sales which don't belong to the partition, and the sale data will be processed prior to the company because of the ordering of the join filters, so the merge agent ensures a corresponding delete will be sent to the subscriber. You can avoid this by having the filter order correspond to the PK-FK order but might not be the correct order for your business requirements. The foolproof method is to explicitly set the order using the @processing_order parameter of sp_addmergearticle and this'll override the derived order. Plenty more work to do, and this is not supported in SSMS so it's back to scripting. (5) Join Filters which "oppose" Foreign Keys (2 deletes)Please see this article for more info. (7) Merge users not in the PALFellow MVP James Rowland-Jones did some nice research on this section which I've linked to in the title. Basically if the executing user is not in the PAL and doesn't have the correct permissions on the table, the merge view which determines what is replicated will filter out the record, so we have non-convergence. The permissions check is done between the user and the table, so having EXEC rights on a stored proc that inserts into the table won't be sufficient. (8) Using Joins with Multiple FiltersI'm sure filters and joins are where the complication and confusion of merge lies. Here is another example for this article: if you have a join filter between 2 tables, and a different static table filter on each table in the join, the results are not filtered correctly, or at least not at all intuitively. E.G. consider 2 tables: Customers and Orders. These are joined by a PK - FK relationship on CustomerID so that as we'd expect a customer can place several orders. If I create a join filter to link the tables, all 'ordering' Customers and their associated Orders are replicated. I then add a filter clause on the Customers table (parent) to restrict the output to a particular customer: CustomerID = 'Joe'. The resulting (automatically created) merge view for Orders restricts the results to the orders belonging to customer 'Joe'. So, on initialization one customer and his associated orders are replicated. However, what if I add another filter now to the Orders table: CustomerID = 'Harry'? Logically this should further restrict the replicated orders and we'd expect zero records returned. However, on initialization we find that more orders are now replicated! In this case the merge view is created which UNIONs the 2 clauses, rather than using a logical AND clause! This is quite different to the above scenarios - we now observe extra rows rather than seeing rows unexpectedly deleted. (9) Adding New Records without firing the Merge Insert TriggerThis can happen in 2 circumstances:
|
Paul Ibison, Copyright © 2011 |