How to check all publications and all articles for non-convergenceIntroductionIn replication, whether it is Merge or Transactional, we assume that the data changes are recorded by the replication setup and that we fully understand the data flow. However, in transactional replication at some stage we might find a constraint violation when synchronizing, meaning that someone has edited the subscriber data without us realising and in merge replication there are sometimes changes made to the data which don’t fire the replication triggers. In each case we end up with non-convergence of data and really we want to know about this well before it causes a problem. One way we can proactively test for this is by using “validation” of the agents. This has its benefits but we might want to do a system-wide check without generating lengthy scripts and we would like to fully control when this takes place. This article looks at using the TABLEDIFF command-line utility to produce a report of data comparisons. I have already done an article explaining how to use the TABLEDIFF utility and listing all the various parameters it takes, so I won't cover all this basic info again. Basically I run the utility below once a week and then examine the report to look for any discrepancies. It will find all the publications, subscriptions and articles and do a full comparison for me. By changing the argument values for the TABLEDIFF command, we can decide whether to use locking of the table during the comparison and if we want to report on the erroring rows. Generation of TABLEDIFF CodeThe procedure used to generate the arguments
for TABLEDIFF is listed below. It generates a list of TABLEDIFF
command-line arguments to examine an instance of SQL Server and
compare all the articles in all the publications to all the
subscribers. Not all the relevant info for this proc can be
gained from the distribution database (eg the "destination
owner" is not always correctly populated in dbo.MSArticles) so I
have to find the published databases and then iterate through
them using dynamic SQL. The procedure has to use dynamic SQL
because the database context can’t be changed using “USE” from
within a stored procedure, but there are no arguments being
passed in so it’s not really a worry to me in terms of SQL
injection etc. DECLARE @publicationdatabase VARCHAR(500) SSIS Package to run TABLEDIFFsSo, once the list of commands is returned, how can we use it? Well, for each record in the resultset, we want to run a command shell. This sounds rather like a programming task and you might assume that this means learning C# but no fear – SSIS is quite capable of this sort of thing. The SSIS Control Flow is shown below. If you’re not too familiar with this sort of thing, is is not too complex and I'll break it down bit by bit.
To start off with, we run the code above and we need somewhere to put the resulting dataset. To run a script like this we use an ExecuteSQL task and specify that the resultset goes into a variable (called myRecordset below) which is a variable having the Object datatype.
Next we want to iterate through this resultset and read each row’s value into a variable. This is what the ForEach Loop does. It enumerates over myRecordset and uses myArgument mapped to Index (0) – the first column.
For each variable value in myArgument we want to run the TABLEDIFF utility. For this we use the Execute Process Task and configure the path to the executable. The command window needs to be Hidden and needs to not fail if it encounters an error – we want it to complete all comparisons regardless of any errors encountered.
Finally, the expressions section is used to feed in the value of MyArgument to the “Arguments” of the command executable.
ConclusionsSo that’s about it. Once it’s all set up, you schedule it to run on a Sunday night and check the results each Monday morning, which is pretty much what I do. If you want to make it more sophisticated,
you could parse the results file and chop out all the matching
rowcount messages so we are left with just the relevant (nonmatching)
part – you can’t do this in the TABLEDIFF arguments but it
shouldn’t be too difficult to do in Powerscript or by reading
into a SQL table and TSQL to do the deletes. |
Paul Ibison, Copyright © 2010 |