Everything you need to know about backup and disaster recovery (BDR)
At some point, all computer hardware fails. It’s a fact of life. Whether it’s from age or accident, data loss is inevitable when hardware fails, and it can cripple an SMB in a second and destroy profits just as easily. This is why an effective business continuity plan and Backup & Disaster Recovery (BDR) solution are essential to virtually every business today.
There’s no way to predict the future; hard drives malfunction unpredictably, cyber-attacks are on the rise, and natural disasters may strike with little to no warning. To ensure data is secure, it needs to be backed up and quickly recoverable so downtime is minimal when the unforeseeable happens.
Backup and disaster recovery has come a long way. Long gone are the days of media-vaulted backup and manual recovery methods. Today’s BDR solutions provide secure, fast, monitored, and continuous backup and rapid data restoration through cloud-based architecture. A variety of methods and options are available in the marketplace to suit most business needs.
The evolution of backup
Data backup began with tapes being copied in a machine and stored in a physical vault, typically off-site, and that process didn’t change much for decades. On-site backup solutions are nearly as old as computing itself: tried, trusted, and true. Whether it’s a database that needs backing up, unstructured files, applications, or anything in between, there’s a backup solution out there that can get the job done.
The resultant backup may go to tape, optical media, or disk, but the result is the same: a collection of backup media that gets put in a vault. To keep that media safe, on-site backup has always had an off-site component, which has mostly consisted of somebody taking the backed-up media and moving it elsewhere. While this is often called off-site backup, it’s more properly “off-site backup media vaulting”.
During the past two decades, this vaulting form of off-site backup meant either moving media or, more progressively, creating a storage repository at a remote site and tunneling the backup data over what were almost always lower-bandwidth WAN links after the backup was complete. This either narrowed the time frame available for a backup window or reduced the bandwidth available to business users. Storage costs also soared in an era when storage was still prohibitively expensive for anyone but large enterprises. A decade ago, the evolution of backup brought a new generation of off-site backup: online backup, in which a solution provider backs up data to an off-site, hosted platform, obviating the need for media transport.
That same evolution has since merged backup with cloud computing. Over time, bandwidth has increased, and it has become possible to use third-party services to handle online off-site backup. Hardware has simultaneously become abstracted through virtualization. This combination of increased bandwidth and commoditized hardware, coupled with the natural evolution of business continuity software, has enabled off-site backup and disaster recovery solutions to offer continuous data protection with a level of redundancy that was once the purview of expensive systems like those from the former Tandem Computers.
Neither off-site nor on-site backup is enough in the event a disaster strikes, and that has been the big driver behind cloud-based off-site backup. If a disaster happens, the data on backup media (off-site or on-site) will not be enough to fully recover. Replacement computer systems to restore the data will be needed, as will the networks that connect the systems.
Key terms & definitions
Backup and disaster recovery (BDR) is a combination of data backup and disaster recovery solutions that work cohesively to ensure a company’s business continuity.
Remote data backup is the process of backing up data created by remote and branch offices (ROBOs) and storing it securely. Businesses with ROBOs require backup and recovery solutions that can support the company’s data protection policies and business service levels.
Backup window is the timeframe within which backups are scheduled to run on a given system. These are often scheduled during times of minimal usage (e.g. after hours).
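As a rough illustration, a scheduler might check whether the current time falls inside the backup window before starting a job. The sketch below assumes a hypothetical after-hours window of 22:00 to 04:00; the function name and the window times are illustrative and not taken from any particular product.

```python
from datetime import time


def in_backup_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls inside the window [start, end)."""
    if start <= end:
        # Window sits within a single day, e.g. 01:00 to 05:00.
        return start <= now < end
    # Window crosses midnight, e.g. 22:00 to 04:00.
    return now >= start or now < end


# Hypothetical after-hours window (assumption for illustration only).
WINDOW_START = time(22, 0)
WINDOW_END = time(4, 0)

print(in_backup_window(time(23, 30), WINDOW_START, WINDOW_END))  # late evening
print(in_backup_window(time(12, 0), WINDOW_START, WINDOW_END))   # midday
```

The only subtle case is a window that spans midnight, which is why the comparison is split into two branches.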
Recovery time objective (RTO) is a benchmark indicating how quickly data must be recovered to ensure business continuity following a disaster or unplanned downtime.
Recovery point objective (RPO) is a benchmark indicating how much recent data, measured in time, a business can afford to lose following a disaster or unplanned downtime. It defines the point in time to which data must be recoverable for normal business operations to resume (e.g. all data backed up before time X must be recovered), and in conjunction with RTO can help administrators determine how frequently backups should execute.
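To make the relationship between these two benchmarks and backup frequency concrete, here is a minimal sketch. It assumes the worst case, in which a failure occurs just before the next scheduled backup, so the data lost equals one full backup interval. The function names and the example figures are hypothetical.

```python
import math
from datetime import timedelta


def backups_per_day_needed(rpo: timedelta) -> int:
    """Minimum evenly spaced backups per day so the gap never exceeds the RPO."""
    return math.ceil(timedelta(days=1) / rpo)


def meets_objectives(
    backup_interval: timedelta,
    estimated_restore_time: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> bool:
    """Worst-case data loss is one backup interval; restore must fit in the RTO."""
    return backup_interval <= rpo and estimated_restore_time <= rto


# Hypothetical targets: lose at most 4 hours of data, be back within 8 hours.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)

print(backups_per_day_needed(rpo))
print(meets_objectives(timedelta(hours=24), timedelta(hours=2), rpo, rto))
print(meets_objectives(timedelta(hours=4), timedelta(hours=2), rpo, rto))
```

Under these assumed targets, a nightly backup fails the RPO even though the restore time is comfortably inside the RTO, which is why the two benchmarks have to be considered together.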
Disaster recovery is the area of security planning that deals with protecting an organization from the effects of significant negative events. Significant negative events, in this context, can include anything that puts an organization’s operations at risk: crippling cyber-attacks and equipment failures, for example, as well as hurricanes, earthquakes and other natural disasters.
Cloud disaster recovery is a component of a disaster recovery plan that involves maintaining copies of enterprise data in a cloud storage environment as a security measure.
Business continuity encompasses a loosely defined set of planning, preparatory, and related activities that are intended to ensure that an organization's critical business functions will either continue to operate despite serious incidents or disasters that might otherwise have interrupted them, or will be recovered to an operational state within a reasonably short period.
Types of backup
Full backup is a method where all the files and folders selected (or even an entire machine) will be backed up in their entirety. It’s commonly used as a first backup and is then followed up with subsequent incremental or differential backups.
When you perform a full backup, it’ll contain a complete backup of all selected data. When the next backup is scheduled to run, the entire list of files and folders will be copied again (regardless of whether or not any changes have been made). This simplifies the restore process, as the complete dataset lives in each and every backup task, but it also consumes the most storage space and can cause backups to take quite a bit of time to complete.
Differential backup is a process that begins with one full backup, and then subsequently backs up all changes that have been made since the previous full backup. Compared with running a full backup every time, this allows for much faster backups (but slower restores), and makes more efficient use of storage capacity.
Incremental backup is largely the same as differential backup, with one important difference: after the initial full backup, each subsequent backup stores only the changes that have been made since the previous backup cycle, whether that cycle was a full or an incremental one.
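The difference between the three methods comes down to which cutoff time is used when selecting files. The following is a simplified sketch that assumes change detection by modification timestamp alone; real backup tools may use other signals such as archive bits, block-level change tracking, or checksums. `FileInfo` and `select_files` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    modified_at: int  # modification time as a Unix timestamp


def select_files(files, mode, last_full_at, last_backup_at):
    """Return the paths a backup run of the given mode would copy."""
    if mode == "full":
        # Full: copy everything, changed or not.
        return [f.path for f in files]
    if mode == "differential":
        # Differential: everything changed since the last FULL backup.
        cutoff = last_full_at
    elif mode == "incremental":
        # Incremental: everything changed since the last backup of ANY kind.
        cutoff = last_backup_at
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return [f.path for f in files if f.modified_at > cutoff]


files = [FileInfo("a.txt", 50), FileInfo("b.txt", 150), FileInfo("c.txt", 250)]
# Assume the last full backup ran at t=100 and the last incremental at t=200.
print(select_files(files, "full", 100, 200))          # ['a.txt', 'b.txt', 'c.txt']
print(select_files(files, "differential", 100, 200))  # ['b.txt', 'c.txt']
print(select_files(files, "incremental", 100, 200))   # ['c.txt']
```

The trade-off shows up at restore time: a differential restore needs the full backup plus only the latest differential, while an incremental restore needs the full backup plus every incremental taken since.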
Mirror backup is, as the name suggests, a real-time duplicate of the source being backed up. With mirror backups, when a file in the source is deleted, that file is eventually also deleted in the mirror backup. Because of this, mirror backups should be used with caution: a file deleted from the source by accident, through sabotage, or by a virus will also be deleted from the mirror. Some do not consider a mirror to be a backup.
Many online backup services offer a mirror backup with a 30-day delete. This means that when you delete a file on your source, that file is kept on the storage server for at least 30 days before it is eventually deleted. This strikes a balance, offering a level of safety while keeping the backups from growing indefinitely, since online storage can be relatively expensive.
Many backup software utilities support mirror backups. The main advantage and disadvantage are:
- Advantage: The backup is clean and does not contain old and obsolete files.
- Disadvantage: There is a chance that files in the source deleted accidentally, by sabotage, or through a virus may also be deleted from the backup mirror.
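The 30-day delete behaviour described above can be sketched as a mirror that records when a file was first seen missing from the source and only purges it once the retention period has elapsed. This is a toy in-memory model, not how any specific service implements it; `sync_mirror`, `RETENTION`, and the dictionary layout are assumptions for illustration.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # assumed grace period before a purge


def sync_mirror(source_paths: set, mirror: dict, now: datetime) -> None:
    """Update `mirror` in place.

    `mirror` maps each path to None while the file exists in the source,
    or to the time it was first noticed missing from the source.
    """
    for path in source_paths:
        # File exists in the source: mirror it and clear any pending delete.
        mirror[path] = None
    for path in list(mirror):
        if path in source_paths:
            continue
        if mirror[path] is None:
            # Newly missing from the source: start the retention clock.
            mirror[path] = now
        elif now - mirror[path] >= RETENTION:
            # Missing for the full retention period: purge from the mirror.
            del mirror[path]


mirror = {}
start = datetime(2024, 1, 1)
sync_mirror({"a.txt", "b.txt"}, mirror, start)
sync_mirror({"a.txt"}, mirror, start + timedelta(days=1))   # b.txt deleted at source
sync_mirror({"a.txt"}, mirror, start + timedelta(days=10))  # b.txt still recoverable
print("b.txt" in mirror)
sync_mirror({"a.txt"}, mirror, start + timedelta(days=31))  # retention elapsed
print("b.txt" in mirror)
```

If the file reappears in the source during the grace period, the pending delete is cleared, which is the safety margin a plain real-time mirror lacks.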
Local backup is any backup where the storage medium is kept on-site. Typically, storage is plugged in directly to the source computer being backed up, or is connected through a local area network to the source being backed up. This is the most basic form of backup and data-loss prevention, and contains several inherent disaster-related risks as there is no offsite redundancy or cloud component.
Cloud or remote backup is a type of offsite backup that allows users to access, restore, or administer backups from either the source location or an offsite location. Data here is backed up in the cloud (either directly, or via a local appliance); this type of backup provides some of the strongest available protection against natural disasters and unplanned downtime.
Typical cloud backup solutions, also known as online backup, focus on copying data files to a physically remote location, which is great for disaster recovery. Hybrid backup integrates cloud backup and local backup to deliver system recovery, rapid file restores, and disaster recovery.
Hybrid backup is the combination of both cloud backup and local backup, where local backup is typically a USB drive or network shared drive or NAS device. The ideal hybrid backup solution integrates these forms of backup in an automatic, user-friendly utility running transparently in the background. While local backups are typically sufficient for protecting the data and other information on a computer system, the cloud backup adds a level of assurance that offsite backup data is safe from disaster.
Hybrid cloud data recovery backs up each production server as a virtual machine image, either by making a copy of the current VM or by converting physical servers to VM images (a process referred to as physical to virtual, or P2V) as part of the backup process. The local appliance stores these images just like it does regular file backups but also provides a platform on which they can be restarted in case the primary server goes down.
In this way, a single appliance can act as a local standby server for multiple primary servers and VMs. The failover isn't automatic, but many hosted disaster recovery services can provide what's essentially high availability (HA) to the production server environments as part of their backup infrastructure. The final step is to move these VM images to the cloud provider's data center, which has enough compute resources to restart any of them in the event of a disaster at the client's site.
Data backup and disaster recovery are not the same. Backup software can fail, as can the person responsible for backing up. Backing up without recovery in mind is tantamount to not backing up at all. Finally, there are other steps you have to take in order to successfully restore your data when you need your backup, such as assembling the right recovery environment (the right operating systems, servers, and storage) and the right people, processes, and tools to bring back that backed-up data.
Why do businesses need BDR?
1. Backup software can fail
There are numerous examples where an unjustified faith in data backup software left an organization hanging after a disruption. Take the case of a Civil District Court in New Orleans. What seemed like a routine recovery of the county’s conveyance and mortgage records database after a server crash turned into a bigger headache than a night out in the French Quarter during Mardi Gras. Without conducting a full restoration test, what went undiscovered was that the installation of an upgraded version of backup software actually failed, despite an indication that the upgrade had been successful. And for nearly a year, new records that were thought to be backed up were not, all while old copies were purged every 30 days.
The end result: All new entries and changes that occurred after the most recent backup were lost, as well as all records dating back to the 1980s.
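A restoration test of the kind that was skipped in this case can be approximated by recording checksums at backup time and comparing them against a test restore. The sketch below assumes backups and restores are plain directory trees on disk; `build_manifest` and `verify_restore` are hypothetical helper names, and SHA-256 is simply one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict, restored_root: Path) -> list:
    """Return a list of problems found in a test restore; empty means it matches."""
    restored = build_manifest(restored_root)
    problems = []
    for rel, digest in manifest.items():
        if rel not in restored:
            problems.append(f"missing: {rel}")
        elif restored[rel] != digest:
            problems.append(f"mismatch: {rel}")
    return problems
```

In use, `build_manifest` would run against the source at backup time, and `verify_restore` would run periodically against a scratch restore. A non-empty result is the early warning that a "successful" backup job alone does not provide.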
2. You have to back up with recovery in mind
Stephen Covey states it best in The 7 Habits of Highly Effective People: “Begin with the end in mind.” The same goes for data backup and disaster recovery. You have to back up your data as if you will one day need to get it back.
Here’s an example of why this is so critical.
One of our partners used a third-party company to manage their backups. That third party backed data up from different servers and multiple applications by striping it across tape. The third party’s main concern was not restoring data; it was backing data up as quickly as possible. Along came a disaster which required the IT team to recover their data. When they began to bring data back from these tapes, they quickly realized that striping their data had created a million-piece jigsaw puzzle that was nearly impossible to reconstruct. In the end, they couldn’t find all the tapes necessary to put together this “puzzle.”
3. Data backup is only the first chapter
Getting a secure copy of your data backed up at an offsite location is only chapter one of disaster recovery. Chapter two is having the right recovery systems connected to your data, meaning you need to have the right servers, storage, hypervisors, and operating systems in your recovery environment. Basically, your recovery environment needs to reflect your production environment. This is not an easy step as there are many changes that occur daily in a production environment that IT staffs are frequently too busy to capture.
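One way to catch the drift described here is to compare an inventory of the production environment against the recovery environment on a regular basis. The sketch below assumes each environment can be summarised as a simple dictionary of component names to versions; the keys, values, and the `environment_drift` function are all hypothetical.

```python
def environment_drift(production: dict, recovery: dict) -> dict:
    """Return components whose values differ, as {name: (production, recovery)}."""
    drift = {}
    for key in sorted(set(production) | set(recovery)):
        prod_value = production.get(key)  # None if absent from production
        rec_value = recovery.get(key)     # None if absent from recovery
        if prod_value != rec_value:
            drift[key] = (prod_value, rec_value)
    return drift


# Hypothetical inventories (illustrative values only).
production = {"os": "server-os 2.1", "hypervisor": "hv 7.0", "database": "db 19"}
recovery = {"os": "server-os 2.0", "hypervisor": "hv 7.0"}

for name, (prod, rec) in environment_drift(production, recovery).items():
    print(f"{name}: production={prod} recovery={rec}")
```

An empty result means the two inventories agree; anything else is a change in production that the recovery environment has not yet captured.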
Let’s say you DO have the right recovery environment; chapter three is having the right people, processes, and tools needed to recover at the time you need them. We see this problem all the time: the Oracle guy is not available, the Windows guy was not willing to travel, the runbooks were outdated or based on the older operating system, etc.
All of this is to say that data backup and disaster recovery are not the same, but both are necessary for long-term business technology resiliency. You must have the right recovery mindset, which means:
a) Backing up data according to your recovery strategy;
b) Connecting the right recovery systems to the properly backed-up data; and
c) Creating a programmatic approach to recovery by arming yourself with the right people, right processes, and right tools, and making sure they’re all available at the right time.