SQL Server 2005 BCP Partitioning
In SQL Server 2000, when an article is BCPd to the file system (the distribution working folder) during snapshot generation, a single file is always used for the data. In SQL Server 2005, by contrast, when you look in the distribution working folder after creating a snapshot you might be surprised to find many such files for each article, each containing a part of the table's data.
Clearly there has been a big change in the processing rules, and it is not documented in BOL in any detail. I'll refer to this overall process as "BCP Partitioning", a term borrowed from developers posting in the Microsoft Replication Newsgroup. This article explains why BCP Partitioning exists, what to expect, and how to troubleshoot if it all goes wrong.
Why was it created?
There are several benefits to BCP Partitioning. Firstly, when the snapshot is being applied to the subscriber, there might be a network outage. In SQL Server 2000 this would mean that the complete snapshot would need to be reapplied - and in the case of a "concurrent snapshot", all in one transaction. However, if you have a SQL Server 2005 distributor and SQL Server 2005 subscribers, there is now much greater granularity in the process. Each partition is applied in a separate transaction, meaning that after an outage the snapshot distribution is able to continue at the partition level where it left off and complete just the remaining partitions. For a table containing a lot of rows this could lead to a huge saving in time. Other useful side-effects are that it can cause less expansion of the transaction log (assuming the delivery spans a log backup or the simple recovery model is in use) and it allows parallel execution of the BCP process on machines with more than one processor (parallel execution did exist in SQL Server 2000, but only across several articles, never within a single table). The same benefits apply when creating the initial snapshot using the snapshot agent.
To test the number of files produced and to investigate the algorithm, I created a simple table (below) and populated it:
CREATE TABLE TestBCP (id int NOT NULL)

DECLARE @id int
SET @id = 1
-- insert enough rows to exceed the partitioning threshold
WHILE @id <= 100000
BEGIN
    INSERT INTO TestBCP (id) VALUES (@id)
    SET @id = @id + 1
END
On my machine I initially found a simple dependency on the number of rows, which on further investigation was completely unrepeatable! That is, the number of BCP files seemed to vary with the size of the table in unpredictable ways. Quite why this occurred is explained in the troubleshooting section.
(a) For 8 processors or fewer, the formula used to calculate the number of partitions is: number of partitions = number of processors x 4.
So, in my case, as I have 4 processors, I would expect there to be 16 partitions, which is exactly what I saw. Even for 1 processor there could be 4 partitions, which reinforces the fact that multithreading is only part of the reason behind the algorithm. Note that the -BcpBatchSize parameter of the snapshot and distribution agents simply governs how often progress messages are logged and has no bearing at all on the number of partitions.
(b) There is a minimum threshold of 1000 rows, below which no partitioning occurs.
(c) The data was found to be evenly distributed among the partitions (data files).
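The three rules above can be sketched as a simple calculation. This is only a model of the behaviour I observed - the internal algorithm itself is undocumented, and the processor and row counts below are just example values:

```sql
-- Model of the observed partitioning rules (not Microsoft's actual code)
DECLARE @processors int, @rows int
SET @processors = 4      -- a machine with 8 processors or fewer
SET @rows = 100000       -- rowcount of the article's table

SELECT CASE
           WHEN @rows < 1000 THEN 1   -- rule (b): below the threshold, one file
           ELSE @processors * 4       -- rule (a): processors x 4
       END AS ExpectedPartitions      -- 4 processors -> 16 partitions
```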
Troubleshooting
(a) Why is the number of partitions incorrect?
In many cases, as mentioned above, what I found wasn't in accordance with the 3 rules. In some cases I had expected to see many partitions yet all the data was BCPd to just one file, and in others I had fewer than 1000 rows although many partitions were created, most of which were empty.
When reading the statistics output, the "Rows" total should be identical to that returned from "select count(*) from testbcp". Running UPDATE STATISTICS testbcp in all cases resulted in these totals being consistent, and subsequently the number of partitions created was in agreement with the 3 rules.
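So if the partition count looks wrong, first check whether the rowcount held in the table's metadata agrees with the real rowcount. One quick way to do this in SQL Server 2005 is via the old sysindexes compatibility view, where rowcnt is held against indid 0 (heap) or 1 (clustered index):

```sql
-- Rowcount as recorded in the table's metadata
SELECT rowcnt FROM sysindexes
WHERE id = OBJECT_ID('TestBCP') AND indid IN (0, 1)

-- The real rowcount
SELECT COUNT(*) FROM TestBCP

-- If the two disagree, refresh before generating the snapshot
UPDATE STATISTICS TestBCP
```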
(b) Disabling BCP Partitioning
To disable BCP Partitioning, you can add the unofficial "-EnableArticleBcpPartitioning 0" switch to the snapshot agent's command line, and a single data file will then be produced, just like in SQL Server 2000.
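For example, a snapshot agent command line with the switch added might look like the following (the server, database and publication names here are placeholders, and the switch itself is undocumented, so use it at your own risk):

```
snapshot.exe -Publisher MYSERVER -PublisherDB PubDB -Publication MyPub -Distributor MYSERVER -DistributorSecurityMode 1 -EnableArticleBcpPartitioning 0
```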
Why would you want to turn off such a useful feature? Well, anecdotally, things may get worse for folks who don't start off with empty tables (archiving or roll-up scenarios), or who use the concurrent snapshot (the default for SQL Server 2005), and any or all of CPU, disk I/O and network bandwidth can become the bottleneck in the attempt to extract more snapshot-processing throughput using BCP Partitioning.
(c) Ensuring Bulk-Logging
For those tables which really expand the transaction log, some DBAs like to enable the bulk-logged recovery model to minimise logging, but this won't always work when we are dealing with multiple partitions. To ensure the best chance of going down the bulk-logged path, you should use -MaxBcpThreads > 1 for the distribution agent and ensure that the target table doesn't have any indexes on it before the distribution agent delivers the snapshot, or just use -MaxBcpThreads = 1 to turn off parallelism, although the latter option obviously might reduce performance.
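As a sketch, the sequence at the subscriber might look like this (SubscriberDB is a placeholder name; switch back to your normal recovery model once the snapshot has been delivered):

```sql
-- Before the distribution agent delivers the snapshot
ALTER DATABASE SubscriberDB SET RECOVERY BULK_LOGGED

-- ... the distribution agent applies the snapshot here, either with
-- -MaxBcpThreads = 1 or with no indexes yet on the target table ...

-- Afterwards, restore the usual recovery model (and take a log backup)
ALTER DATABASE SubscriberDB SET RECOVERY FULL
```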
The new BCP Partitioning functionality offers real improvements - initialization is generally faster and certainly more resilient. Most of us would never need to know that this mechanism had changed in SQL Server 2005, and would simply accept that somehow the system seems improved and faster, but hopefully this article has shed a little more light on what goes on behind the scenes and will help if troubleshooting is ever called for.
Paul Ibison, Copyright © 2013