
Posts Categorized / High Availability

  • Sep 19 / 2017
dbDigger, High Availability, Replication, SQL Server tools, SQL Server Training

SQL Server Replication refresher

While preparing training material, I put together a brief overview of replication architecture. In the coming lines I am sharing it in this post as a good starting point for replication learners.


Replication is a way of distributing data automatically from a source database to one or more recipient databases. It is built to synchronize selected objects of a database, not the whole database. A replication setup encompasses the set of objects in a database whose data must be kept in sync at subscriber databases. Replication is not part of the SQL Server database engine itself, but a separate feature that works through agents, triggers, and scheduled jobs created for this purpose. Replication can be used in high availability systems or in distributed systems, and supports different topologies.

There are three main types of replication:

  • Transactional Replication (one or two way)
  • Merge Replication
  • Snapshot Replication

Transactional Replication

This type of SQL Server replication is used for near real-time or slightly delayed synchronization between publisher and subscriber. Initially a snapshot process synchronizes the subscribers with the publisher. All committed transactions on the publisher are then propagated to the subscribers in sequence. It may be one-way or two-way transactional replication.

In one-way replication, modifications from the publisher are synced to the subscriber, but no modification from the subscriber is propagated back to the publisher. In peer-to-peer or two-way replication, changes are propagated both ways and data remains in sync on both sides for the selected articles.
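As a rough sketch, a one-way transactional publication can be created with the replication stored procedures. The server, database, and publication names below (SalesDB, SalesPub) are placeholders, not names from this post:

```sql
-- Run on the publisher, in the database to be published.
USE SalesDB;
GO
-- Enable the database for transactional publishing
EXEC sp_replicationdboption
    @dbname  = N'SalesDB',
    @optname = N'publish',
    @value   = N'true';
GO
-- Create a continuous (near real-time) transactional publication
EXEC sp_addpublication
    @publication = N'SalesPub',
    @status      = N'active',
    @repl_freq   = N'continuous',  -- transactional, not snapshot
    @allow_push  = N'true',
    @allow_pull  = N'true';
```

Articles and subscriptions are then added to this publication in separate steps.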

There are two options while propagating changes back from subscriber to publisher.

  • Immediate updating
  • Queued updating

Both methods can be used at the same time to maximize the availability and efficiency.

Immediate updating

In the case of immediate updating, changes made at the subscriber are synced to the publisher using a two-phase commit process managed by Microsoft Distributed Transaction Coordinator (MS DTC). Changes made at the subscriber must be applied to both systems or are rolled back. This method is suitable for a reliable, high-speed connection between the publisher and subscriber.

Queued updating

This method is suitable for less reliable connections. Updates made at the subscriber are written to a queue; the publisher then reads the queue and applies the changes. This flow may result in conflicts due to latency.
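The two updating modes are chosen when the subscription is created, via the `@update_mode` parameter of `sp_addsubscription`. A hedged sketch, reusing the placeholder names from earlier (SalesPub, SUBSRV1, SalesDB_Sub are illustrative):

```sql
-- @update_mode = N'sync tran'   -> immediate updating (2PC via MS DTC)
-- @update_mode = N'queued tran' -> queued updating (changes queued at subscriber)
EXEC sp_addsubscription
    @publication       = N'SalesPub',
    @subscriber        = N'SUBSRV1',
    @destination_db    = N'SalesDB_Sub',
    @subscription_type = N'push',
    @update_mode       = N'queued tran';
```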


Peer to Peer replication

Peer-to-peer replication was introduced in SQL Server 2005 and is supported only in the Enterprise edition. It resembles two-way transactional replication in some respects. The main difference is that peer-to-peer replication involves more than a single publisher and its associated subscriber nodes; it can span multiple nodes, each of which is kept in sync. Peer-to-peer replication also handles propagation of article DDL and the locking mechanism better.

Merge Replication

Merge replication is a relatively complex type of replication. It does not use transactions but a set of defined rules to resolve conflicts when data updated at multiple sites is synchronized. It tracks changes through triggers and metadata tables, and provides the means to implement part of a high-availability system. It also serves its original purpose of supporting mobile and disconnected users well. It is designed for cases where the publishers are not in constant communication with the subscribers. After the initial snapshot synchronization, subsequent changes are tracked locally and the databases are merged when in contact, using a series of rules to resolve conflicts.


Snapshot Replication

Data changes are not tracked for snapshot replication. Each time a snapshot is applied, it completely overwrites the existing data. The snapshot replication process also provides the initial synchronization for transactional and merge publications, and in several cases this initial synchronization is all that is necessary. Data is synced at defined intervals, as opposed to the constant transactional sync of the other types.
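A snapshot publication uses the same `sp_addpublication` call shown earlier, but with `@repl_freq = N'snapshot'` so data moves only when the snapshot agent runs. Names are again placeholders:

```sql
EXEC sp_addpublication
    @publication = N'SalesSnapshotPub',
    @status      = N'active',
    @repl_freq   = N'snapshot';  -- overwrite subscribers on each snapshot run
```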


Distribution Setup

Distribution is an important part of the replication architecture. It involves the following components and terms.

Distribution server

The distribution server is the server that hosts the distribution setup (agent, jobs, and database). It may be the same server as the publisher or a dedicated server; for large and busy replication setups a separate distributor server is recommended. A server can be marked as distributor before the publisher and subscriber are configured.
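Marking a server as distributor and creating the distribution database can be sketched as below. Server names, paths, and the password are placeholders:

```sql
-- Run on the server taking the distributor role.
EXEC sp_adddistributor
    @distributor = N'DISTSRV1',
    @password    = N'StrongPassword!1';

-- Create the distribution database (name, files, and location can be customized)
EXEC sp_adddistributiondb
    @database    = N'distribution',
    @data_folder = N'D:\SQLData',
    @log_folder  = N'E:\SQLLogs';

-- Register a publisher that will use this distributor
EXEC sp_adddistpublisher
    @publisher         = N'PUBSRV1',
    @distribution_db   = N'distribution',
    @working_directory = N'\\DISTSRV1\repldata';
```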

Distribution database

Distribution is a system database for the replication setup and is created automatically on the server with the distributor role; however, the database name, file names, location, and file sizes can be customized. For transactional replication, the distribution database tracks changes made to publications. For merge replication it stores only synchronization history.

Distribution agent

The distribution agent is set up as a separate module and runs as a scheduled job under the replication jobs category. It is the process that moves transactions and snapshots held in the distribution database to the subscribers.


Replication components

There are some other terms and components that are always involved in replication setup.


The publisher is a server that makes data available to other servers, i.e. it is the server that makes data available for replication.


A publication can be a selection of tables, indexed views, and stored procedures. Each publication can be replicated as a snapshot publication, a transactional publication, or a merge publication. It is important to choose the publication articles on a valid basis, such as size and the relationships between tables. There may be multiple publications on a single publisher.


An object included in a publication is called an article. Articles are combined on some basis to form a publication. An article can be an entire table, selected rows from a table, specified columns from a table, an indexed view, or a stored procedure. An article is the unit of a publication.
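Adding a table article to an existing publication can be sketched with `sp_addarticle`; a row filter narrows the article to selected rows. The publication, table, and filter below are illustrative placeholders:

```sql
EXEC sp_addarticle
    @publication   = N'SalesPub',
    @article       = N'Orders',
    @source_owner  = N'dbo',
    @source_object = N'Orders',
    @type          = N'logbased',          -- standard transactional article
    @filter_clause = N'Region = ''EU''';   -- optional horizontal (row) filter
```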


The subscriber is the server that hosts the database where data from the publisher is synced. There may be multiple subscribers for a single publisher. A subscriber may be on the same server as the publisher, and there may be multiple subscriber databases on a single subscriber server.


A subscription relates to a publication and targets a subscriber database. There may be multiple subscriptions against a single publisher, each going to a separate subscriber database. There are two types of subscriptions.

Pull Subscription

With a pull subscription the subscriber asks the publisher for periodic updates, and each subscriber can specify the best time for the update. Pull subscriptions are generally considered the better choice when a large number of subscribers are involved, or where the load on the LAN needs to be distributed.

Push Subscription

With push subscription the publisher pushes any changes made to the data out to the subscribers. Typically push subscription is used where changes need to be replicated to the subscribers shortly after the change occurs, i.e. where the replication of changes is required in near real time.
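The difference between the two comes down to where the subscription is registered and where the distribution agent runs. A hedged sketch with placeholder names:

```sql
-- Push: registered at the publisher; the distribution agent runs at the
-- distributor and pushes changes out to the subscriber.
EXEC sp_addsubscription
    @publication       = N'SalesPub',
    @subscriber        = N'SUBSRV1',
    @destination_db    = N'SalesDB_Sub',
    @subscription_type = N'push';

-- Pull: registered as pull at the publisher; a matching
-- sp_addpullsubscription call at the subscriber lets the distribution
-- agent run there and fetch changes on the subscriber's own schedule.
EXEC sp_addsubscription
    @publication       = N'SalesPub',
    @subscriber        = N'SUBSRV2',
    @destination_db    = N'SalesDB_Sub',
    @subscription_type = N'pull';
```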

Replication Agents

Several agents were mentioned while discussing the components of replication in the sections above. It is worth listing them together to give a clear picture of the setup:

  • Snapshot agent
  • Log Reader Agent
  • Distribution Agent
  • Merge Agent
  • Queue Reader Agent

Replication effect on log file

It is important to know the effects of replication on the database log file. If replication falls behind (latency), log records accumulate in the log file until they are sent to the distributor. This may ultimately fill the log file and slow down the system in some cases. The situation is the same regardless of recovery model, FULL or even SIMPLE.
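Whether replication is what is holding up log truncation can be checked from `sys.databases`; the database name below is a placeholder:

```sql
-- When replication is preventing log truncation, log_reuse_wait_desc
-- shows REPLICATION, in any recovery model (FULL or SIMPLE).
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'SalesDB';

-- Current log size and percent used for all databases
DBCC SQLPERF (LOGSPACE);
```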

Replication topologies

Topology is how the replication setup is designed and its components are arranged. The following topologies (originally shown as diagrams) illustrate the common arrangements:

  • Central Distributor
  • Central Publisher with Remote Distributor
  • Central Publisher
  • Central Subscriber
  • Publishing Subscriber

  • May 06 / 2014
CLR, dbDigger, High Availability, SQL Server Clustering, SQL Server logs

Using ‘odsole70.dll’ version ‘2009.100.1600’ to execute extended stored procedure ‘sp_OACreate’

Recently we came across an unexpected cluster failover on one of our servers, and it was required to find the exact reason for it. I analyzed the SQL Server and Windows logs. There were no traces of failure, nor did I find any major issue in the logs that might lead to a failover. However, an entry in the logs caught my attention. It was the following message:

Using ‘odsole70.dll’ version ‘2009.100.1600’ to execute extended stored procedure ‘sp_OACreate’. This is an informational message only; no user action is required.

I googled it and learned that it was the culprit of the event. According to its BOL page:

You call some Automation procedures from a SQL Server common language runtime (CLR) object, such as sp_OACreate. In this situation, SQL Server may unexpectedly crash.

Note This issue also occurs when a CLR object calls a Transact-SQL procedure that calls Automation procedures.
It applies to SQL Server 2005, SQL Server 2008, SQL Server 2008 R2, and SQL Server 2012. More detail can be found here.

Now we have to trace the call and modify it to avoid the accidental failover.

  • Dec 19 / 2013
dbDigger, Disaster Recovery, High Availability

Different HA/DR technologies available in SQL Server

In a previous post I discussed the scope and differences of High Availability (HA) and Disaster Recovery (DR). Recalling that concept, we are now going to cover the features provided by SQL Server.

SQL Server provides following levels of HA/DR

  • Object level (Tables, views, USPs)
  • Single or Multiple Database Level
  • Server Level (multiple databases with server objects and services)

Before building a plan from a combination of different technologies, we will go through a brief description of each technology.


Replication

Replication is used to mark and distribute specified data and database objects. Replication articles may be distributed to multiple destinations. It is considered an object-level recovery solution.

Replication Pros

  • Only specified data changes are transmitted to target servers thus lowering the data travel volume
  • Replicated data provides redundancy that may be used for reporting or stand by purpose
  • Can also replicate the schema changes
  • Database can be in Simple recovery model
  • Database on secondary is both readable and writeable
  • Can choose to replicate only a subset of tables within the database
  • Since all databases are writeable, you can create different indexes optimized for the reporting environment

Replication Cons

  • Not suitable when whole database recovery is required rather than selective objects
  • Replication administration grows complicated when volume of data and number of DML transactions increases
  • A separate server role must be made available and configured as distributor
  • Requires extra disk space for saving snapshots and replication transactions
  • Have to change application connection string to an available node if the node it is connected to becomes unavailable
  • Replication architecture builds an interim queue in the distribution database, causing higher latencies

Log Shipping

Log shipping has remained a popular DR solution across previous versions of SQL Server. It is the process of automating transaction log backups at a defined interval and automatically restoring them on the standby server.
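The essence of the process, shown manually below, is a repeating backup-copy-restore cycle; the log shipping feature wraps these steps in scheduled jobs. Database name and paths are placeholders:

```sql
-- On the primary: back up the transaction log to a shared location
BACKUP LOG SalesDB TO DISK = N'\\share\logship\SalesDB_1.trn';

-- On the standby (database previously restored WITH NORECOVERY or STANDBY):
RESTORE LOG SalesDB
FROM DISK = N'\\share\logship\SalesDB_1.trn'
WITH STANDBY = N'E:\SQLLogs\SalesDB_undo.dat';  -- readable between restores
```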

Log Shipping Pros

  • Can use compressed backups
  • Very good way to validate the transaction log backups you are already taking
  • Can run in a delayed restore mode to protect secondary from human error on primary site
  • Provides DB Level protection
  • Can have multiple secondaries
  • Very good option when other DR options are not possible or supported by a vendor, since it is essentially a constant transaction log backup and restore sequence
  • Standby server is not strictly required to be identical to the primary server

Log Shipping Cons

  • Data loss may be experienced
  • Downtime is expected
  • Database must be in Full or Bulk Logged recovery model
  • No automatic failover
  • No automatic redirection
  • Can only failover at database granularity – cannot group a set of databases to failover together
  • Database on secondary not readable/writable during restore
  • Cross database dependencies not supported


Database Mirroring

Database mirroring was introduced in SQL Server 2005 and is a good alternative to log shipping, as it covers some of the limitations we faced with log shipping. Mirroring also provides the option to choose between two modes.
High-safety mode supports synchronous operation. Under high-safety mode, when a session begins, the mirror server synchronizes the mirror database with the principal database as quickly as possible. Once the databases are synchronized, a committed transaction is committed on both partners, at the cost of increased transaction latency.

In high-performance mode, as soon as the principal server sends a log record to the mirror server, it sends a confirmation to the client without waiting for an acknowledgement from the mirror. This means transactions commit without waiting for the mirror server to write the log to disk. Such asynchronous operation permits the principal server to run with minimum transaction latency, at the risk of some potential data loss.
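A minimal mirroring setup can be sketched as follows; the endpoint URLs and database name are placeholders, and the mirror database must first be restored WITH NORECOVERY from a backup of the principal:

```sql
-- 1. On the mirror server, point at the principal:
ALTER DATABASE SalesDB SET PARTNER = N'TCP://principal.contoso.com:5022';

-- 2. On the principal server, point at the mirror:
ALTER DATABASE SalesDB SET PARTNER = N'TCP://mirror.contoso.com:5022';

-- 3. Choose the mode: FULL = high-safety (synchronous),
--    OFF = high-performance (asynchronous)
ALTER DATABASE SalesDB SET PARTNER SAFETY FULL;
```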

Mirroring Pros

  • Primarily a DR solution. Can be used as HA when used with synchronous/automatic failover options.
  • Protects against I/O subsystem failure on the primary server or datacenter
  • Log stream compression available for increased performance
  • Automatic failover is possible, but requires a third witness server
  • Automatic redirection to secondary is possible
  • Can choose between several mirroring and failover options: synchronous or asynchronous streaming, automatic or manual failover
  • DB Level protection
  • Does not require special hardware (SAN) and cluster ware
  • Automatic page repair on the primary server by retrieving a copy of the page from the mirror

Mirroring Cons

  • Database must be using the FULL recovery model
  • Database on secondary is not readable/writeable
  • Additional management overhead of third witness server
  • Deprecated in SQL 2012
  • Only 1 secondary allowed
  • Can only failover at database granularity – cannot group a set of databases to failover together
  • Application connection strings may need a one-time modification to include mirror redirection
  • System databases master, tempdb, and model cannot be mirrored
  • Only one secondary database is supported with database mirroring in SQL Server 2008
  • Data loss is possible in asynchronous operation mode
  • Does not support FILESTREAM
  • Cross database dependencies not supported


Clustering

Clustering automatically allows one physical server to take over the tasks and responsibilities of another physical server that has failed. It is considered a pure HA solution and does not provide data recovery or redundancy features.

Clustering Pros

  • Uses time tested Windows Clustering technology. Advanced heartbeat, failover, quorum model, and quorum voting options available in the latest versions of Windows
  • Protects against Machine failures. ex. CPU/motherboard issues
  • Protects against OS failures. ex. blue screens
  • Protects against SQL Instance failures. ex. SQL hangs
  • Application connection strings require no change
  • Works well with most other features such as log shipping, replication, and asynchronous availability groups
  • Manages external dependencies well upon failover
  • Supports rolling upgrades scenarios
  • Instance-level protection without data redundancy
  • No special requirements with respect to database recovery models

Clustering Cons

  • Does not maintain a redundant copy of the data, so it does not protect against an I/O subsystem failure; not a DR solution
  • Must ensure nodes are maintained properly (patching levels should match)
  • A major outage like a data center power failure or failure of the network link to the primary data center is not addressed if all nodes are within the same data center.

SAN based Replication

SAN replication resembles database mirroring in that both replicate data from one location to another. However, SAN replication alone cannot fail over automatically. It can fail over automatically when used with a Windows Cluster, together with software installed on the cluster that tells the storage array to fail over. Without a Windows Cluster in place, you will just have a server sitting idle, waiting on the storage array.

SAN Based Replication Pros

  • Ensures disaster recovery through redundancy
  • Works effectively with clustering

SAN Based Replication Cons

  • Additional cost for SAN based storage
  • No automatic failover when used without clustering

SQL Server Always On Availability Groups

It’s like mirroring, but we get multiple mirrors for many more databases, we can fail them over in groups, and we can query the mirrors. An availability group supports a failover environment for a discrete set of user databases, known as availability databases, that fail over together. An availability group supports a set of read-write primary databases and one to four sets of corresponding secondary databases. Optionally, secondary databases can be made available for read-only access and/or some backup operations. It is based on a Windows Server Failover Clustering (WSFC) cluster, which should not be confused with SQL Server failover clustering.

Always-On Solution Layers

A successful SQL Server Always-On solution requires understanding and collaboration across these layers:

Infrastructure level: server-level fault tolerance and inter-node network communication.

SQL Server instance level: a SQL Server Always-On Failover Cluster Instance (FCI) is a SQL Server instance that is installed across, and can fail over to, server nodes in a WSFC cluster. The nodes that host the FCI are attached to robust symmetric shared storage (SAN or SMB).

Database level: An availability group is a set of user databases that fail over together. An availability group consists of a primary replica and one to four secondary replicas. Each replica is hosted by an instance of SQL Server (FCI or non-FCI) on a different node of the WSFC cluster.

Client connectivity: Database client applications can connect directly to a SQL Server instance network name, or they may connect to a virtual network name (VNN) that is bound to an availability group listener. The VNN abstracts the WSFC cluster and availability group topology, logically redirecting connection requests to the appropriate SQL Server instance and database replica.
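The database and client-connectivity layers above can be sketched in T-SQL. Node names, endpoint URLs, and the listener IP are placeholders; this assumes a working WSFC cluster and database mirroring endpoints already exist:

```sql
CREATE AVAILABILITY GROUP SalesAG
FOR DATABASE SalesDB
REPLICA ON
    N'SQLNODE1' WITH (
        ENDPOINT_URL      = N'TCP://SQLNODE1.contoso.com:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,   -- local HA replica
        FAILOVER_MODE     = AUTOMATIC),
    N'SQLNODE2' WITH (
        ENDPOINT_URL      = N'TCP://SQLNODE2.contoso.com:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,  -- remote DR replica
        FAILOVER_MODE     = MANUAL);

-- The listener gives clients a single virtual network name (VNN)
ALTER AVAILABILITY GROUP SalesAG
ADD LISTENER N'SalesListener' (
    WITH IP ((N'10.0.0.50', N'255.255.255.0')), PORT = 1433);
```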

Availability Groups Pros

  • HA (synchronous, local datacenter) and DR(asynchronous, remote datacenter) solution
  • Protects against I/O subsystem failure on the primary server or datacenter
  • Log Stream compression available for increased performance
  • Automatic Page Repair
  • Automatic redirection to secondary is possible via an AG listener
  • Automatic failover is possible and does not require a third witness server; the Windows cluster is used for quorum
  • Up to 4 secondaries
  • Can failover a set of databases as a group
  • Can choose between several mirroring and failover options: synchronous or asynchronous streaming, automatic or manual failover
  • Can offload T-Log backups on secondary
  • Database on secondary can be set as readable (Readable or READ-INTENT secondary options)

Availability Groups Cons

  • Database must be using the FULL recovery model
  • Database on secondary is not writeable
  • System databases master, tempdb, and model cannot be part of an AG
  • Cross database dependencies unsupported
  • Special configuration considerations when using Replication
  • Quorum model and node vote options should be configured to ensure an outage at the DR site will not cause the cluster to lose quorum
  • If one database in the group fails over, all databases in the group fail over together
  • The synchronous or asynchronous mode is set per replica for the HA or DR scenario, not per individual database

If we are using SQL Server 2012, this is the best choice for HA and DR, provided the budget and infrastructure limits are acceptable.

Combining different features

Now we may look at few combinations derived from above mentioned technologies. One or more of these may suit our requirements and budget.

Clustering with Mirroring

Keeping in mind the pros and cons of both options, we know that clustering lacks DR features while mirroring is all about data redundancy and recovery. Combining them can therefore produce a very good solution in terms of both HA and DR. The only problem with this solution is that if multiple databases are involved in a failover, they might not be in the same consistent state at a given point in time.

Clustering with SAN Replication

Keeping in mind the pros and cons of both options, we know that clustering lacks DR features while SAN replication is all about data redundancy and recovery. Combining them can therefore produce a very good solution in terms of both HA and DR.

Pros of Clustering with SAN Replication

  • Protects against I/O subsystem failure on the primary datacenter
  • Data disks can be synchronously or asynchronously mirrored
  • External dependencies handled well as in typical single site failover clusters
  • Instance-level protection WITH data redundancy
  • SQL 2012 supports multi-subnet environments, eliminating the need for a VLAN

Cons of Clustering with SAN Replication

  • Additional expense of SAN replication technology
  • Additional complexity of managing and configuring SAN replication technology
  • Quorum model and node vote options should be configured to ensure an outage at the DR site will not cause the cluster to lose quorum

Availability Groups with SQL Server Clustering

Availability groups emerged as our best choice while reviewing the different options in the previous sections. They are based on a Windows Server Failover Clustering (WSFC) cluster, and we may add SQL Server failover cluster instances to ensure SQL Server service availability in case of an outage or disaster; this acts as a second layer of failover. AlwaysOn Availability Groups do not depend on any form of shared storage. However, if you use a SQL Server failover cluster instance (FCI) to host one or more availability replicas, each of those FCIs will require shared storage as per a standard SQL Server failover cluster instance installation.

Always-On Availability Groups

Availability groups help ensure availability of application databases and enable zero data loss through log-based data movement, providing data protection without shared disks. They offer an integrated set of options including automatic and manual failover of a logical group of databases, support for up to four secondary replicas, fast application failover, and automatic page repair.

Always-On Failover Cluster Instances (FCIs)

FCIs enhance the SQL Server failover clustering feature and support multisite clustering across subnets, which enables cross-data-center failover of SQL Server instances. Faster and more predictable instance failover is another key benefit, enabling faster application recovery.

  • Dec 19 / 2013
DBA Interview questions, dbDigger, Disaster Recovery, High Availability

High Availability VS Disaster Recovery

Most of the time these two terms are used interchangeably; however, we should keep a clear distinction between them in mind, particularly while dealing with disaster recovery (DR) and high availability (HA) scenarios.

One-word difference

The major difference lies in one word: data. HA does not involve data recovery aspects, while DR does, as the name indicates.

High availability (HA)

  • It is the system’s ability to remain accessible in the event of a system component failure
  • It involves multiple levels of fault tolerance and/or load balancing capabilities into a system
  • It does not include the redundancy or data recovery abilities
  • Service downtime does not exist or is in seconds

Disaster Recovery (DR)

  • It is the process by which a system is restored to a previous acceptable state, after a natural or man-made disaster
  • DR involves recovery of data and services within some down time
  • Downtime depends upon the option used along with the HA involved
  • For DR, measurable characteristics, such as Recovery Time Objective (RTO) and Recovery Point Objective (RPO) drive our DR plan

  • Aug 09 / 2008
DBA best practices, dbDigger, High Availability

DBA Best Practices for General High Availability

Due to the growing data dependency of organizations, high availability is imperative. While implementing any configuration for high availability, follow these baselines.

  1. Physically protect your SQL Servers from unauthorized users.
  2. Physically document all of your SQL Server instances. Incorporate effective change management.
  3. Always use a RAIDed array or SAN for storing your data.
  4. Use SQL Server clustering, database mirroring, or log shipping to provide extra fault tolerance.
  5. Replication is not an effective means to protect your data.
  6. Ensure that your entire IT infrastructure is redundant. It is only as strong as its weakest link.
  7. Always use server-class hardware, and standardize on the same hardware as much as possible.
  8. Use hardware and software monitoring tools so you can quickly become aware of when problems first arise.
  9. After testing, apply all new service packs and hot fixes to the OS and SQL Server.
  10. Cross-train staff so that there are multiple people who are able to deal with virtually any problem or issue.

Chosen from SQL Server DBA Best Practices by Brad M. McGehee

  • Aug 09 / 2008
DBA best practices, dbDigger, High Availability, Replication

DBA Best Practices for Replication

Following best practices will help you to configure a successful replication

  1. Replication needs should be clearly defined before creating a replication topology. Successful replication can be difficult and requires much pre-planning.
  2. Ideally publishers, distributors, and subscribers should be on separate physical hardware.
  3. Create, document, and test a backup and restore strategy. Restoring replicated databases can be complex and requires much planning and practice.
  4. Script the replication topology as part of your disaster recovery plan so you can easily recreate your replication topology if needed.
  5. Use default replication settings, unless you can ensure that a non-default setting will actually improve replication performance or other issues. Be sure that you test all changes to ensure that they are as effective as you expect.
  6. Fully understand the implications of adding or dropping articles, changing publication properties, and changing schema on published databases, before making any of these changes.
  7. Periodically, validate data between publishers and subscribers.
  8. Regularly monitor replication processes and jobs to ensure they are working.
  9. Regularly monitor replication performance, and performance tune as necessary.
  10. Add alerts to all replication jobs so you are notified of any job failures.

Chosen from SQL Server DBA Best Practices by Brad M. McGehee
