Tuesday, January 28, 2014

Boom - How Do You Protect Data if the Datacenter Ceases to Exist?


Sometimes corporate videos can be a bit on the cheesy side, but I really like the one I'm posting above, as it allows me to explain to my family one of the main things the product I work on, VPLEX, does. It also gets some extra credit, as early in my career with EMC I had the opportunity to work with Steve Todd, one of the people in the video.

One of the ultimate problems in protecting data is what to do if the datacenter actually ceases to exist. In my career I've seen this happen for many reasons. On the low end, there is the problem of power outages or maintenance windows which temporarily take a datacenter offline - the datacenter ceases to exist for a finite amount of time. On the more extreme end, the datacenter can cease to exist permanently. Extreme weather, often accompanied by flooding, is one culprit that takes out datacenters. Similarly, bolts of lightning have destroyed datacenters. Though the loss of data pales next to the loss of life, the September 11 terrorist attacks did illustrate how deliberate acts of destruction could also affect data. And as I mentioned in my first post, loss of data can in some cases lead to loss of life. While researching this topic, I found an article at PC World quoting Leib Lurie, CEO of One Call Now:
The 9/11 attacks "geared people toward a completely different way of thinking," Lurie said. "Everyone has always had backup and colocation and back-up plans, every large company has. After 9/11 and [Hurricane] Katrina and the string of other things, even a three-person law firm, a three-person insurance agency, a doctor with his files, if your building gets wiped out and you have six decades of files, not only is your business gone, not only is your credibility gone, but you're putting hundreds of lives at risk." 
The loss of a doctor's records could be fatal in some cases, and with the loss of a law firm's records, "you could have people tied in knots legally until you find alternative records, if you find them," Lurie said.

There are various options that a storage administrator can take to protect a datacenter. At the very least there would need to be a backup stored off-site, with backup methods ranging from periodic snapshots to continuous data replication. And, amazingly enough, EMC provides solutions for both these options, with Mozy and RecoverPoint. (Hey, I warned you that while I'm not writing on behalf of my employer I'm still a fan.) Mozy is geared more for the PC and Mac environment, taking periodic snapshots and backing them up to the Mozy-provided cloud, whereas RecoverPoint uses journaling to keep track of every single write, allowing for very precise rollbacks.
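
To make the difference between the two approaches concrete, here is a minimal sketch of a periodic snapshot versus a write journal. It is a toy model and not how Mozy or RecoverPoint are actually implemented; the names `JournaledVolume`, `image_at`, and `snapshot` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class JournaledVolume:
    """Toy block volume that journals every write (hypothetical, illustration only)."""
    blocks: dict = field(default_factory=dict)
    journal: list = field(default_factory=list)  # entries are (sequence, block_id, data)

    def write(self, block_id, data):
        # Record the write in the journal, then apply it to the live volume.
        seq = len(self.journal) + 1
        self.journal.append((seq, block_id, data))
        self.blocks[block_id] = data
        return seq

    def image_at(self, seq):
        """Rebuild the volume as it was right after write number `seq`."""
        image = {}
        for entry_seq, block_id, data in self.journal:
            if entry_seq > seq:
                break
            image[block_id] = data
        return image


def snapshot(volume):
    """Periodic snapshot: a point-in-time copy taken only when called."""
    return dict(volume.blocks)


vol = JournaledVolume()
vol.write(0, "a")            # write 1
snap = snapshot(vol)         # the only snapshot we happen to have
vol.write(0, "b")            # write 2
vol.write(1, "c")            # write 3
vol.write(0, "corrupt")      # write 4, the one we want to undo

from_snapshot = snap             # loses writes 2 and 3
from_journal = vol.image_at(3)   # everything up to just before the bad write
```

With only the snapshot, the best available restore point is the state after the first write. With the journal, any write boundary is a valid restore point, which is what makes the rollback precise.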

These options allow for disaster recovery. If the disaster occurs you will experience an outage, albeit one you can recover from. As my family's IT manager I find that suits our needs very well - when my wife replaced her laptop we simply told Mozy to restore to a new laptop. I myself tend to be more in the cloud full-time and use Google Drive as my main storage, giving me replication and the ability to recover.

However, while recovering from an outage without loss of data (or minimal loss with some solutions) is fantastic, many enterprise solutions need continuous availability. Recovery from a disaster is not sufficient. I know I would have been rather annoyed if my bank told me my information was not available while they recovered from backup in the aftermath of one of the many storms that have hit us here in Massachusetts over the past several years. That's where having a solution which allows for continuous availability, even in the event of the destruction of a datacenter, becomes essential. That's one of the things that my product, VPLEX, does - you are able to mirror writes to two sites separated by substantial distances. And just as importantly, it is possible to read the same data from either datacenter. If one datacenter ceases to exist, the other one is still up and is able to continue operating (as the rather dramatic video at the beginning of this post illustrates). And for even more protection, many customers combine the RecoverPoint and VPLEX products, allowing for both rollback and continuous data protection.

All of this comes at a cost, making users balance their availability needs vs. their budget. Not all applications need continuous availability. But those that do need to be able to endure a wide variety of potential problems, ranging from maintenance windows all the way to the datacenter destruction described here. And providing these solutions presents challenges a vendor must address. For example, some of the most obvious include:

  • When does an availability solution tell a host a data write has completed? This question becomes more and more critical as the latency between sites increases.
  • What does an availability solution do if a remote site cannot be seen? How does it determine if the remote site has had an outage or if the communications link has been severed?
  • How is re-synchronization handled when two or more separated sites are brought back together?
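
The three questions above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not a description of how VPLEX or any other product behaves: the classes `Site` and `MirroredVolume`, the `witness_sees_remote` callback, and the policy of simply refusing writes when only the link is down are all hypothetical choices made for illustration.

```python
class Site:
    """One datacenter's copy of the volume (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.reachable = True

    def store(self, block_id, data):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.blocks[block_id] = data


class MirroredVolume:
    """Synchronous mirror: a write is acknowledged only once it is safe."""

    def __init__(self, local, remote, witness_sees_remote):
        self.local = local
        self.remote = remote
        # Assumption: a third party that can tell us whether the remote site
        # is still alive when we cannot reach it ourselves.
        self.witness_sees_remote = witness_sees_remote
        self.dirty = set()  # blocks written while the remote site was down

    def write(self, block_id, data):
        """Return True when the host may treat the write as complete."""
        try:
            self.remote.store(block_id, data)
        except ConnectionError:
            if self.witness_sees_remote():
                # Remote site is alive, only the link is severed. Refuse the
                # write so the two copies cannot diverge (simplified policy).
                return False
            # Remote site is really gone: carry on alone, remember the block.
            self.dirty.add(block_id)
        self.local.store(block_id, data)
        return True

    def resync(self):
        """Copy blocks written during the outage back to the remote site."""
        for block_id in sorted(self.dirty):
            self.remote.store(block_id, self.local.blocks[block_id])
        copied = len(self.dirty)
        self.dirty.clear()
        return copied


site_a, site_b = Site("A"), Site("B")
witness_view = {"remote_alive": True}
vol = MirroredVolume(site_a, site_b, lambda: witness_view["remote_alive"])

ok_normal = vol.write(1, "x")         # both sites store it before the ack
site_b.reachable = False              # site A can no longer reach site B
ok_partition = vol.write(2, "y")      # witness still sees B: write refused
witness_view["remote_alive"] = False  # witness confirms B is down
ok_outage = vol.write(2, "y")         # A continues alone, block 2 is dirty
site_b.reachable = True               # B comes back
resynced = vol.resync()               # dirty blocks are copied to B
```

Because the acknowledgement waits for the remote store, every write pays for the trip to the other site, which is why the first question gets harder as latency grows. A real system would also need a rule for which site keeps running during a partition; refusing writes, as this sketch does, is the simplest way to stay consistent but gives up availability.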

