Home  |  Articles  |  Scripts  |  Consultancy  |  FAQs  |  Links  |  SSIS

 

Replication Across Non-Trusted Domains or Using the Internet

Replication over the internet or across non-trusted domains is usually performed using a virtual private network (VPN), and consequently the configuration is much the same as that used on a LAN. This article outlines what to do if such a VPN is not available. While it's true that in almost all cases you wouldn't consider making SQL Server available directly to the internet, loosely -coupled networks in a hosted environment are commonplace. These systems ensure integrity by the use of firewall rules, resulting in DMZ/ECZ layers and causing serious issues for the would-be replicator. This is very much a 'how-to' guide, as the intricacies of network connectivity setup would otherwise make the article too long.
In replicating over non-trusted networks, the main additional considerations over and above a normal setup are:

(a) when configuring the subscriber, the publisher will not initially be visible in enterprise manager, and cannot be added in the usual way.

(b) after the snapshot has been created, it is normally accessed by the merge/distribution agent using the repldata share, which will be unavailable over the internet.

(c) the replication agents normally run under the context of a windows user account and are impersonated at the distributor/publisher, however an AD lookup at the publisher when accessing the snapshot files will not be able to find the windows user.

What follows is a list of the configuration changes necessary to solve these issues.

(i) Network Configuration - Ports etc

SQL communication needs to be enabled on the port defined at the publisher. For a default instance this is typically port 1433 while FTP uses port 21 by default. So, rules allowing the subscriber's IP address access on these ports through the firewall need to be set up, implying that the subscriber must not get a leased DHCP IP address, but have a fixed address of its own. In SQL Server 2005 merge replication is possible over port 443 (SSL/HTTP) - more on that in another article - but for our case, the security is provided by the use of logins and passwords, firewall rules, and encryption if necessary.

(ii) Publication Configuration

The repldata share on the publisher that is normally used to hold the snapshot initialization files will be unavailable - discounting the use of a guest windows login at the publisher which is simply too insecure a method to consider - so an alternative initialization method is needed. The main choices are automatic FTP and alternative snapshot locations at the subscriber. For the latter, when a pull subscription is created, it can be pointed at a local directory for the snapshot files. I often use this method as my snapshot files are so large (>40GB) that I save a lot of time doing the whole thing manually. Basically, this involves the creation of the snapshot, compressing the directory using WinZip 9.0, manually FTPing to the subscriber then unziping locally. After that I use the alternative snapshot location (@alt_snapshot_folder) on the subscriber to initialize. This method is used because although compression using CAB files is available within the publication wizard, it is limited to a maximum size of 2GB.


This article however will concentrate on the inbuilt, automatic method of FTP which is configured at the publisher on the snapshot location tab. If all the subscribers need to be set up in the same way (non-trusted domains etc) then on this tab only the second checkbox is selected at the top. The snapshot folder lies under the FTPROOT directory and 'Subscribers can access this folder using FTP' obviously needs to be selected. If you do choose to compress, the files will be uncompressed using the Temp folder of the subscriber, so you must check that you have enough space. The client path is the path from the FTP root directory, so in this case is \TRSnapshot\ftp. The login can be left as the default for anonymous access (not recommended) or a valid windows login at the publisher is selected that has read access to the FTPROOT directory – PaulI in this case.

(iii) alias configuration at subscriber

In order for the subscriber to 'see' the publisher, an alias must be created in the client network utility before registering the publisher in Enterprise Manager. This is mandatory as the use of an IP address in Enterprise Manager will later result in various errors. Be sure to:
(a) use TCP/IP and not named pipes, otherwise you'll get a "SQL server does not exist or access is denied." error.
(b) create an alias that has exactly the same name as the publisher/distributor.


(iv) hosts file configuration at the subscriber

The FTP request at the subscriber needs to be able to associate the publisher/distributor's NETBIOS name with its IP address. As we have a non-trusted environment, this entry won't exist on the DNS on the subscribers’s LAN, so we manually add it to the HOSTS file on the subscriber, located in C:\WINNT\system32\drivers\etc. After the hashed out section, the file should look like the text below, where DOHXXX-SQL is the Publisher/Distributor's netbios name.

127.0.0.1 localhost
62.6.139.105 DOHXXX-SQL


If this is omitted, the error message will be the FTP initialization error : “Message: The process could not connect to FTP site 'DOHCOL-SQL' using port 21”.

(v) Create a PULL Subscription

When using the Pull subscription wizard, the ‘Choose Publication’ form allows browsing of the publisher’s publications. The publisher can be added to EM at this stage or already be configured. If everything has been correctly set up using the previous steps, the publications list should be visible. The synchronization agent login should be set to use a SQL login (pass-through is possible but let’s leave that for now) already added to the PAL. If the subscriber is registered at the publisher, this can be a named subscription, otherwise it’ll be anonymous. The only other thing to set up is to select the option to ‘Download using FTP’ on the Snapshot Delivery form.