:::: MENU ::::

Posts Categorized / Disaster Recovery

  • Dec 19 / 2013
  • 0
dbDigger, Disaster Recovery, High Availability

Different HA/DR technologies available in SQL Server

In a previous post i have discussed the scope and differences of High Availability (HA) and Disaster Recovery (DR). Recalling that concept now we are going to cover the features provided in SQL Server.

SQL Server provides following levels of HA/DR

  • Object level (Tables, views, USPs)
  • Single or Multiple Database Level
  • Server Level (multiple databases with server objects and services)

Before going for a plan build by combination of different technologies we will have a brief description of each of technology.

Replication

Used to mark and distribute the data and database objects specified. Replication articles may be distributed to multiple destinations. It is considered as object level recovery solution.

Replication Pros

  • Only specified data changes are transmitted to target servers thus lowering the data travel volume
  • Replicated data provides redundancy that may be used for reporting or stand by purpose
  • Can also replicate the schema changes
  • Database can be in Simple recovery model
  • Database on secondary is both readable and writeable
  • Can choose to replicate only a subset of tables within the database
  • Since all databases are write able, you can create different indexes optimized for reporting environment

Replication Cons

  • Not suitable when whole database recovery is required rather than selective objects
  • Replication administration grows complicated when volume of data and number of DML transactions increases
  • A separate server role is required to make available and configure as distributor
  • Requires extra disk space for saving snapshots and replication transactions
  • Have to change application connection string to an available node if the node it is connected to becomes unavailable
  • Replication architecture builds an interim queue in the distribution database, causing higher latencies

Log Shipping

Log shipping remained a popular DR solution in previous versions of SQL Server. It is the process of automating transaction log backup at defined interval and automatically restores them on the standby server.

Log Shipping Pros

  • Can use compressed backups
  • Very good way to validate the transaction log backups you are already taking
  • Can run in a delayed restore mode to protect secondary from human error on primary site
  • Provides DB Level protection
  • Can have multiple secondaries
  • Very good option when other DR options are not possible or supported by a vendor since it is essentially a
  • constant transaction log backup and restore sequence
  • Stand by server is not strictly required to be same as primary server

Log Shipping Cons

  • Data loss may be experienced
  • Downtime is expected
  • Database must be in Full or Bulk Logged recovery model
  • No automatic failover
  • No automatic redirection
  • Can only failover at database granularity – cannot group a set of databases to failover together
  • Database on secondary not readable/writable during restore
  • Cross database dependencies not supported

Mirroring

Introduced in SQL Server 2005 and is a good alternate for log shipping as it covers some of limitation that we faced in log shipping. Mirroring also provides us the option to choose among two modes.
High-safety mode supports synchronous operation. Under high-safety mode, when a session begins, the mirror server synchronizes the mirror database with the principal database as quickly as possible. Once the databases are synchronized a committed transaction is committed on both partners, at the cost of increased transaction latency.
High-performance mode, as soon as the principal server sends a log record to the mirror server, the principal server sends a confirmation to the client, without waiting for an acknowledgement from the mirror server. This means that transactions commit without waiting for the mirror server to write the log to disk. Such asynchronous operation permits the principal server to run with minimum transaction latency, at the risk of some potential data loss.

Mirroring Pros

  • Primarily a DR solution. Can be used as HA when used with synchronous/automatic failover options.
  • Protects against I/O subsystem failure on the primary server or datacenter
  • Log stream compression available for increased performance
  • Automatic failover is possible, requires a 3rd witness server
  • Automatic redirection to secondary is possible
  • Can choose between several mirroring and failover options: synchronous or asynchronous streaming, automatic or manual failover
  • DB Level protection
  • Does not require special hardware (SAN) and cluster ware
  • Automatic repair on primary server by referencing the mirror

Mirroring Cons

  • Database must be using the FULL recovery model
  • Database on secondary is not readable/writeable
  • Additional management overhead of third witness server
  • Deprecated in SQL 2012
  • Only 1 secondary allowed
  • Can only failover at database granularity – cannot group a set of databases to failover together
  • Application connection may be modified one time to include mirror redirection
  • System databases master, tempdb, and model cannot be mirrored
  • Only one secondary database is supported with database mirroring in SQL Server 2008
  • data lost is possible in asynchronous operation mode
  • Does not support FILESTREAM
  • Cross database dependencies not supported

Clustering

Automatically allows one physical server to take over the tasks and responsibilities of another physical server that has failed. It is considered as pure HA solution and does not provide data recovery or redundancy features.

Clustering Pros

  • Uses time tested Windows Clustering technology. Advanced heartbeat, failover, quorum model, and quorum voting options available in the latest versions of Windows
  • Protects against Machine failures. ex. CPU/motherboard issues
  • Protects against OS failures. ex. blue screens
  • Protects against SQL Instance failures. ex. SQL hangs
  • Application connection strings are required no change
  • Works well with most other features such as log shipping, replication, and asynchronous availability groups
  • Manages external dependencies well upon failover
  • Supports rolling upgrades scenarios
  • Instance-level protection without data redundancy
  • No special requirements with respect to database recovery models

Clustering Cons

  • Does not maintain a redundant copy of the data and so does not protect against an I/O subsystem failure
    Not a DR solution
  • Must ensure nodes are maintained properly (patching levels should match)
  • A major outage like a data center power failure or failure of the network link to the primary data center is not addressed if all nodes are within the same data center.

SAN based Replication

SAN Replication resembles the database mirroring technology as both do data replication from one location to another. However while using SAN Replication we cannot automatically fail over. SAN Replication can fail over automatically when used with a Windows Cluster and when you have software which is installed on the Windows Cluster and tells the storage array to fail over. However without a Windows Cluster in place, you will just have a server sitting waiting for the storage array.

SAN Based Replication Pros

  • Makes sure the disaster recovery through redundancy
  • Works effectively with clustering

SAN Based Replication Cons

  • Additional cost for SAN based storage
  • No automatic failover when used without clustering

SQL Server Always On Availability Groups

It’s like mirroring, but we get multiple mirrors for many more databases that we can fail over in groups, and we can query the mirror. An availability group supports a failover environment for a discrete set of user databases, known as availability databases that fail over together. An availability group supports a set of read-write primary databases and one to four sets of corresponding secondary databases. Optionally, secondary databases can be made available for read-only access and/or some backup operations. It is based on Windows Server Failover Clustering (WSFC) cluster that should not be confused with the SQL Server clustering

Always-On Solution Layers

A successful SQL Server Always-On solution requires understanding and collaboration across these layers:
Infrastructure level: Server-level fault-tolerance and intra-node network communication.
SQL Server instance level. A SQL Server Always-On Failover Cluster Instance (FCI) is a SQL Server instance that is installed across and can fail over to server nodes in a WSFC cluster. The nodes that host the FCI are attached to robust symmetric shared storage (SAN or SMB).

Database level: An availability group is a set of user databases that fail over together. An availability group consists of a primary replica and one to four secondary replicas. Each replica is hosted by an instance of SQL Server (FCI or non-FCI) on a different node of the WSFC cluster.

Client connectivity: Database client applications can connect directly to a SQL Server instance network name, or they may connect to a virtual network name (VNN) that is bound to an availability group listener. The VNN abstracts the WSFC cluster and availability group topology, logically redirecting connection requests to the appropriate SQL Server instance and database replica.

Availability Groups Pros

  • HA (synchronous, local datacenter) and DR(asynchronous, remote datacenter) solution
  • Protects against I/O subsystem failure on the primary server or datacenter
  • Log Stream compression available for increased performance
  • Automatic Page Repair
  • Automatic redirection to secondary is possible via an AG listene
  • Automatic failover is possible, does not require a 3rd witness server, windows cluster is used for quorum Up to 4 secondaries
  • Can failover a set of databases as a group
  • Can choose between several mirroring and failover options: synchronous or asynchronous streaming, automatic or manual failover
  • Can offload T-Log backups on secondary
  • Database on secondary can be set as readable (Readable or READ-INTENT secondary options)

Availability Groups Cons

  • Database must be using the FULL recovery model
  • Database on secondary is not writeable
  • System databases master, tempdb, and model cannot be part of an AG
  • Cross database dependencies unsupported
  • Special configuration considerations when using Replication
  • Quorum model and node vote options should be configured to ensure an outage at the DR site will not cause the cluster to lose quorum
  • One DB fails then it failed over as a group
  • We can’t choose between Synchronous and Asynchronous modes, it been used internally based on HA or DR scenarios

If we are using SQL Server 2012 then it is the best choice for HA and DR provided the budget and infrastructure limits are OK.

Combining different features

Now we may look at few combinations derived from above mentioned technologies. One or more of these may suit our requirements and budget.

Clustering with Mirroring

Keeping in mind the PROs and CONs of both these options we know that clustering lacks the DR features while Mirroring is all about data redundancy and recovery. So combining these may lead to a very good solution in terms of both HA and DR. The only problem with this solution is that if multiple databases are involved in failover, they might not be in same consistent state at a given point of time.

Clustering with SAN Replication

Keeping in mind the pros and cons of both these options we know that clustering lacks the DR features while SAN replication is all about data redundancy and recovery. So combining these may lead to a very good solution in terms of both HA and DR

Pros of Clustering with SAN Replication

  • Protects against I/O subsystem failure on the primary datacenter
  • Data disks can be synchronously or asynchronously mirrored
  • External dependencies handled well as in typical single site failover clusters
  • Instance-level protection WITH data redundancy
  • SQL 2012 supports multi-subnet environments, eliminating the need for a VLAN

Cons of Clustering with SAN Replication

  • Additional expense of SAN replication technology
  • Additional complexity of managing and configuring SAN replication technology
  • Quorum model and node vote options should be configured to ensure an outage at the DR site will not cause the cluster to lose quorum

Availability Groups with SQL Server Clustering

Availability groups emerged as our best choice while reviewing different options in previous sections. It is based on Windows Server Failover Clustering (WSFC) cluster and we may add SQL Server clusters to make sure the SQL Services availability in case of any outage or disaster. It would be a second layer of failover. AlwaysOn Availability Groups does not depend on any form of shared storage. However, if you use a SQL Server failover cluster instance (FCI) to host one or more availability replicas, each of those FCIs will require shared storage as per standard SQL Server failover cluster instance installation.

Always-On Availability Groups

It helps ensure availability of application databases, and they enable zero data loss through log-based data movement for data protection without shared disks.
Availability groups provide an integrated set of options including automatic and manual failover of a logical group of databases, support for up to four secondary replicas, fast application failover, and automatic page repair.

Always-On Failover Cluster Instances (FCIs)

Enhance the SQL Server failover clustering feature and support multisite clustering across subnets, which enables cross-data-center failover of SQL Server instances. Faster and more predictable instance failover is another key benefit that enables faster application recovery.

  • Dec 19 / 2013
  • 0
DBA Interview questions, dbDigger, Disaster Recovery, High Availability

High Availability VS Disaster Recovery

Most of the times both these terms are used interchangeably however we should have a clear distinction of these in mind. Particularly while dealing with the disaster recovery (DR) and high availability (HA) scenarios.

One word difference
Major difference lies in one word i.e. data. HA does not involve data recovery aspects while DR do as name indicates.

High availability (HA)

  • It is the system’s ability to remain accessible in the event of a system component failure
  • It involves multiple levels of fault tolerance and/or load balancing capabilities into a system
  • It does not include the redundancy or data recovery abilities
  • Service downtime does not exist or is in seconds

Disaster Recovery (DR)

  • It is the process by which a system is restored to a previous acceptable state, after a natural or man-made disaster
  • DR involves recovery of data and services within some down time
  • Downtime depends upon the option used along with the HA involved
  • For DR, measurable characteristics, such as Recovery Time Objective (RTO) and Recovery Point Objective (RPO) drive our DR plan
  • Dec 13 / 2012
  • 0
dbDigger, Disaster Recovery, Maintenance plans, SQL Server Agent scheduled Jobs, SQL Server Error messages, Transaction log files

Log backups fail after changing the SIMPLE recovery model

Recently a scenario was shared with me where maintenance plan was failed to create the log backups. Actually recovery model of DB was set to SIMPLE to prevent the log file population for some log intensive bulk operations. After completing the operations recovery model was put back to FULL. Every thing was OK till this point but maintenance plan job began to failed later for creating log backups of that database with following error

BACKUP LOG cannot be performed because there is no current database backup.

The reason for error is that after changing the recovery model of DB from SIMPLE to BULK LOGGED or FULL, we have to create a full or differential backup before going for log backup.
Solution to avoid such error is simple that we have to go in following sequence

  • Change recovery model to simple
  • Complete the required operations
  • Change back to FULL or BULK LOGGED recovery model
  • Create FULL or DIFFERENTIAL backup
  • Successfully create log backups
  • Here is a quick demo to simulate the above steps

    USE [master]  
     GO  
     -- Set the DB to simple recovery model  
     ALTER DATABASE [DBDigger]   
     SET RECOVERY SIMPLE  
     GO   
     -- Set the DB to full recovery model  
     ALTER DATABASE [DBDigger]   
     SET RECOVERY FULL  
     GO   
     -- Try to Backup the log, it would fail with error  
     Backup LOG DBDIgger   
     to Disk = 'C:DBD-LogBackup.Log'  
     GO  
     -- Now Create full or differential backup of database  
     Backup DATABASE DBDIgger   
     to Disk = 'C:DBD-FullBackup.bak'   
     GO  
     -- Try to Backup the log, it would be OK  
     Backup LOG DBDIgger   
     to Disk = 'C:DBD-LogBackup.Log'  
     GO  
    
    • Nov 17 / 2012
    • 0
    Backup and Restore, dbDigger, Disaster Recovery, SQL Server Training

    Implementing retention span for SQL Server backup device

    SQL Server uses logical devices or media to perform database backups. Backups may be append or overwrite to existing backups in these media sets or devices. It may be a serious issue when a backup in device is silently overwritten without your notice. This scenario may occur if you repeat a previous backup device name with INIT option. Consider the following example for this scenario

    -- Create a backup  
     BACKUP DATABASE DBDIGGER   
     TO DISK = 'E:DBD-BackupDemo.bak'  
     GO  
     /*   
     Create another backup with same device name.  
     By default backup would be performed by using  
     WITH NOINIT option and would be appended  
     */  
     BACKUP DATABASE DBDIGGER   
     TO DISK = 'E:DBD-BackupDemo.bak'  
     GO  
     -- Verify the two backups on backup device  
     RESTORE HEADERONLY FROM DISK = 'E:DBD-BackupDemo.bak'  
     GO  
    

    Here we are shown two backups on the device. It shows that second backup was also appended to the device. Going further we would issue another backup command by using WITH INIT. This time backup would overwrite the existing backups on the device.

     /*   
     Create another backup with same device name and INIT option.  
     It would overwrite the existing backups on device.  
     */  
     BACKUP DATABASE DBDIGGER   
     TO DISK = 'E:DBD-BackupDemo.bak'  
     WITH INIT  
     GO  
    

    Now verify that only last backup exists on the device and all previous backups were overwritten.

    /*   
     Verify that previous backups are overwritten   
     leaving behind the last backup  
     */  
     RESTORE HEADERONLY FROM DISK = 'E:DBD-BackupDemo.bak'  
     GO  
    

    RETAINING THE BACKUPS ON THE DEVICE

    At this point we require some way to ensure that backups on the device would not be overwritten even by accidentally using the WITH INIT option. There may be two approaches to make sure the retention of backups on the device for a specified time.
    • Use WITH EXPIREDATE or WITH RETAINDAYS
    • Set retention days server wide for all databases

    Following is a brief description of these two options

    WITH EXPIREDATE | WITH RETAINDAYS days

    This option operates along with backup command for a specific device of a database. We may specify a date or days along with WITH EXPIREDATE or WITH RETAINDAYS option respectively.  Here we would create a backup device with retention span of two days

    -- Create a backup with retention span 2 days  
     BACKUP DATABASE DBDIGGER   
     TO DISK = 'E:DBD-BackupDemo2.bak'  
     WITH RETAINDAYS = 2  
     GO  
    

    Now create a backup in same device along with using WITH INIT option. This time error would be generated because backups in the device are retained for two days.

     /*   
     Create another backup with same device name and INIT option.  
     It would overwrite the existing backups on device.  
     */  
     BACKUP DATABASE DBDIGGER   
     TO DISK = 'E:DBD-BackupDemo2.bak'  
     WITH INIT  
     GO  
    

    If next backups are created in the device without any retention parameter even then device would maintain the backups with retention parameter and could not be overwritten for next two days. We may get the expiration date or retention span of files by using RESTORE HEADERONLY.
    Similarly EXPIREDATE option may be used to specify an expiration date to which backups on the device would be retained. Date provided should be according to system configured date format.

    SET RETENTION DAYS AT SERVER LEVEL

    We  have another option to include the fixed retention days for every backup device of databases on that very server. For using this method

    • Right click on instance
    • Go to properties
    • Go to database settings in the left panel
    • Set default media retention span in days here. By default it is zero

    The retention days span here would automatically be implemented for every backup device of that server without explicit use of EXPIREDATE or RETAINDAYS.

    Note:

    Although WITH INIT would not be able to overwrite the retained media but WITH FORMAT may overwrite the header and content of device.

    • Dec 27 / 2011
    • 0
    Backup and Restore, DBA Interview questions, dbDigger, Disaster Recovery

    WITH Format option for mirrored backup devices

    WITH clause when used in backup statement has various associated options. I have discussed effect of couple of these associated options

    • INIT/NOINIT
    • FORMAT/NOFORMAT

    In most of the cases both of above options do not make major difference when used while creating backups on disk rather than tape drive. However i got error when using these options for mirror backups. For mirror backups INIT was not producing same results as that of FORMAT. Consider following scenarios
    Try to create full backup of AdventureWorks along with a mirror backup. In following script INIT option is provided

     
    BACKUP DATABASE AdventureWorks  
     TO DISK = 'C:AdventureWorks.bak'  
     MIRROR TO DISK = 'C:AdventureWorks1.bak'  
     WITH INIT  
     GO  
    

    Previously backup device is not present but mirror backup script would produce following error even with or without INIT option.

    Msg 3215, Level 16, State 1, Line 1
    Use WITH FORMAT to create a new mirrored backup set.
    Msg 3013, Level 16, State 1, Line 1
    BACKUP DATABASE is terminating abnormally.

    By providing WITH FORMAT argument as suggested in error message, backup along with mirror backup would be created successfully

    WITH Format option for mirrored backup devices

    Once mirrored media set is created you may append later backups to same mirrored media set without using WITH FORMAT. In case of adding new mirror to existing media WITH FORMAT would be required again.

    • Mar 24 / 2011
    • 1
    dbDigger, Disaster Recovery, Enhancement in new versions, Publications of Atif Shehzad on MSSQLTips.com

    Emergency State for recovery of Corrupt SQL Server Database

    A database in suspect state restricts access to data or settings. Data in suspect databases can not be pulled out nor you can perform recovery operations. In such cases, no matter the problem is due to any corrupt data page or log file, recovery is a night mare. SQL Server 2005 has introduced a very useful state for working with suspect databases. You can set the suspect database to emergency state and then exploit the wide range of operations on data and settings.

    Click here to read full article with practical examples about using the Emergency State for a Corrupt SQL Server Database.
    Pages:12
    Consult us to explore the Databases. Contact us