Tape is dead! Long live the tape! - Part I
by Josefine.Fouarge, on Apr 13, 2015 11:46:39 AM
It feels like there is a movement in some IT circles that says backup to tape has been obsolete ever since ‘the cloud’ found its way into the datacenters of the world. Even in times of private cloud services, SAN snapshots, SSDs, and cheap public cloud storage, we should take our hats off to the old guard that is tape. For decades IT professionals have been committed to backing up their business-critical data to these small, nearly indestructible devices, and a majority still hold out against the looming invader that is ‘the cloud’. Life is not easy for our small, efficient champion. All the rumors spread by disk and cloud vendors keep tape from being recognized as the true hero that it is. The reliable tape just wants to help us keep our data safe and sound for as long as we want it.
That’s why I felt it was time to clear up some of the most widely touted myths about tape:
1) Tape is dead!
Whoops, who said that? A 2014 survey by InformationWeek states that 59% of all backups still go to tape and that 54% of the respondents want to keep backing up to tape in the coming years. Among our own NovaStor DataCenter customers, approximately 70% still use tape devices such as single tape drives or tape libraries to back up their data for retention in a remote location (a bank safe or similar).
Thanks to several NSA leaks, thousands of companies are revising their privacy and retention policies, along with where they store their data. What is safer from the NSA’s prying eyes: a tape locked away in a safe to which you hold the only key, or all your data held in a public cloud for long-term retention?
2) Tape is slow and it doesn’t have enough capacity for my backups
Just looking at the raw numbers, a single LTO-6 drive, for example, can write uncompressed data at up to 160MB/s, and that is without even taking hardware compression into account. Disk throughput varies widely, typically between 25MB/s and 150MB/s, although it can be much higher depending on how much money you throw at spinning disk.
First of all, disks have several speed categories (read, write …). Second, the speed depends on the number of streams you use to push data to the storage device. Third, data reduction (e.g. de-duplication to disk) lets a backup finish faster, but only because less data is actually written. You also have to take into consideration that disks have a lot more moving parts than tape and therefore more opportunities for unreliable speeds. Throughput further depends on the type of disk, the connection, how data is written to the disk, which RAID level you are using, and what else is happening on that disk. There are so many factors to include when trying to figure out what speed you will get to disk that entire software bundles exist just to estimate what you will see in real-life usage. Tape drives, on the other hand, have a set speed that you can expect from each drive. Adding a new drive increases the potential throughput proportionately.
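To make the "set speed per drive" point concrete, here is a rough sketch that estimates how long a backup would take at the 160MB/s LTO-6 rate quoted above. The 5TB data size, the function name, and the assumption that every drive streams at its full native rate for the whole job are illustrative only; real jobs will see lower rates if the backup server cannot feed the drives fast enough.

```python
# Rough backup-window estimate for tape drives running at a fixed rate.
# Assumptions (not from the article): decimal units, perfect streaming,
# and throughput that scales linearly with the number of drives.

LTO6_NATIVE_MB_PER_S = 160  # uncompressed LTO-6 rate quoted in the text


def backup_hours(data_tb: float, drives: int = 1,
                 mb_per_s: float = LTO6_NATIVE_MB_PER_S) -> float:
    """Return the hours needed to write data_tb terabytes to tape."""
    if drives < 1:
        raise ValueError("need at least one drive")
    total_mb = data_tb * 1_000_000          # 1 TB = 1,000,000 MB (decimal)
    seconds = total_mb / (mb_per_s * drives)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical 5TB full backup written with 1, 2 and 4 drives.
    for n in (1, 2, 4):
        print(f"{n} drive(s): {backup_hours(5, n):.1f} h")
```

With these assumptions a single drive needs a little under nine hours for 5TB, and each doubling of the drive count halves that time.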
Much of the pushback against tape comes down to how a restore from disk can be faster when you want to restore a single file or just a set of files. Access is easier on disk than on a tape drive. The library has to load the tape (in the worst case several tapes) and search for the file while spooling back and forth on the tape, so the time for loading these tapes adds to the overall restore time. However, restoring a complete system can be faster from tape. A tape has everything in order (one file after the other wound up on the magnetic tape), whereas a disk holds the information distributed. Because disks manage data in a different way, the controller has to look for every part of a file in different places. Adding de-duplication reduces the restore speed from disk tremendously, because the de-duplicated data has to be rehydrated during the restore.
The tape library is able to use multiple streams to communicate with the backup server. Splitting the data into 4, 8, or even up to 128 streams allows the library to reach its full speed potential, because the data is transferred at a constant rate. Using just one stream is a waste of energy for the library.
LTO-6 can hold up to 2.5TB uncompressed on one tape. That means an average tape library with 24 slots gives you 24 times 2.5TB of potential backup storage. For those who don’t want to start their calculators, that’s 60TB, without even taking into consideration the hardware compression of the LTO drive. To store that much data on disk, you need a larger RAID/SAN, which is a lot more expensive and complex to handle (see 3 and 5).
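The capacity arithmetic above can be written down as a small sketch. The 2.5TB native capacity and the 24 slots come from the text; the function names and the 5TB example data set are assumptions for illustration, and hardware compression is deliberately ignored.

```python
import math

# Native (uncompressed) LTO-6 capacity per cartridge, as quoted in the text.
LTO6_NATIVE_TB = 2.5


def library_capacity_tb(slots: int, tb_per_tape: float = LTO6_NATIVE_TB) -> float:
    """Raw capacity of a library with every slot holding one tape."""
    return slots * tb_per_tape


def tapes_needed(data_tb: float, tb_per_tape: float = LTO6_NATIVE_TB) -> int:
    """Whole tapes required to hold data_tb terabytes, without compression."""
    return math.ceil(data_tb / tb_per_tape)


if __name__ == "__main__":
    print(library_capacity_tb(24))  # 24 slots x 2.5TB = 60.0
    print(tapes_needed(5))          # hypothetical 5TB data set -> 2 tapes
```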
3) Tape is expensive
Just a quick calculation. Let us assume we are talking about a heterogeneous environment with 4 physical servers (1 Active Directory controller which is also the backup server, 1 Exchange server, 2 VMware hosts) and 15 VMs. Inside the VMs are several SQL databases, some Linux systems, and all together a decent amount of data, let’s say 5TB. The backup scheme is 1 monthly full with a 90 day retention, 1 weekly full with a 30 day retention, and 2 weeks’ worth of differential/incremental backups with a 2 week retention. This is what we see in our installations as a common small business environment.
With that backup scheme I need 3 full backups for the monthly retention, so 15TB. I need another 3 full backups for the weekly retention (the last week of the month is covered by the monthly full backup), so another 15TB, plus 2 weeks’ worth of differential/incremental backups, so maybe 1TB (no major changes). That means I need a total of 31TB, plus a few TB to account for growth.
Let’s say we go with a simple disk-based storage device to hold these backups. To keep things simple, let’s assume you have all the networking and other things needed to do iSCSI and that your vendor of choice is Dell.
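The sizing above can be reproduced with a short sketch. The copy counts and sizes are the ones from the text (3 monthly fulls, 3 weekly fulls, roughly 1TB of differential/incremental data); the class and function names are made up for illustration, and growth headroom is left out.

```python
from dataclasses import dataclass


@dataclass
class BackupTier:
    """One retention tier: how many copies are kept and how big each one is."""
    name: str
    copies: int
    size_tb: float

    @property
    def total_tb(self) -> float:
        return self.copies * self.size_tb


def required_storage_tb(tiers: list[BackupTier]) -> float:
    """Sum the storage needed across all retention tiers."""
    return sum(tier.total_tb for tier in tiers)


FULL_TB = 5.0  # size of one full backup, from the example environment

TIERS = [
    BackupTier("monthly full, 90 day retention", copies=3, size_tb=FULL_TB),
    BackupTier("weekly full, 30 day retention", copies=3, size_tb=FULL_TB),
    # The text estimates about 1TB in total for two weeks of changes.
    BackupTier("differential/incremental, 2 weeks", copies=1, size_tb=1.0),
]

if __name__ == "__main__":
    for tier in TIERS:
        print(f"{tier.name}: {tier.total_tb} TB")
    print(f"total: {required_storage_tb(TIERS)} TB")  # 15 + 15 + 1 = 31.0
```

Changing FULL_TB or the copy counts shows quickly how sensitive the total is to the retention policy.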
... stay tuned for Part II of 'Tape is dead! Long live the tape!'. Part II will continue the discussion of the costs of disk and tape devices and analyze the reliability and stability of tape and disk storage.
If you don't want to miss Part II and if you haven't already, sign up for our email to receive information about the technology behind NovaBACKUP DataCenter, NovaStor's technology partners, Webinar invitations, and general network backup and restore knowledge.