Mar 25, 2016
[Podcast] How To Handle A ‘Worst Case Scenario’ DR Situation
Here at Datto, we love to hear about our partners’ disaster recovery scenarios. Why, you may ask? Because they are always a Success Story.
Chad Kempt, founder of Fast Computers Inc., describes one of his client’s server failures as a “worst case scenario” DR situation. Not only did two disks in the server fail, but more and more continued to fail as time went on.
This particular business could not operate during downtime. Sure, they could sit at their desks and work, but no billing could get done because they were not able to create checks, and did not even know the amounts to bill their clients. A true disaster!
Please give us a quick introduction of yourself and a little bit about Fast Computers.
“We used to do a lot of what I call blue collar IT, which is warehouses and manufacturing, places where basically you show up and you need steel-toed boots instead of suits and ties,” Chad says. “We really don’t focus on one target vertical. We have a target client type in mind we like to work with and that is anybody that fits under a professional services umbrella. People who have desktops they sit at, servers they rely on, technology that runs their business, office type environments. We’ve grown from what we were doing in the early 2000s to where we are now and we’re actually looking at getting a new office space, possibly this year, as we’ve kind of outgrown our current one, unless somebody can tell me where they sell double-bunk desks, where I can stack up my staff. We’re out of floor space. Early on, we would sell what I refer to now as the flavor of the week. That is any product or service that seemed like the best deal at the time or the new amazing thing to use. This didn’t seem as terrible at the time as it does now, because it was just me dealing with it and I’ve always kept up on tech, personally outside of work.”
Can you walk us through the early days of Fast Computers when you first took a look at Datto and tell us what value you saw in it? What was the difference between Datto and other vendors you worked with at the time?
Chad says, “Well, I can’t really get to that without mentioning what we were doing before and what brought us to that. What we were originally doing before was a build-your-own-nightmare solution, and with that we were standardized on the equipment we were using and the software we were using and everything. The problem was it wasn’t very consistent. Prices would be changing all over the place. It was difficult to lock down consistent costs. The interface wasn’t very junior tech friendly. Even for things like file restores, we were involving more senior techs. All these sorts of things added up to giving Datto a look. At first glance, it seems like it’s a lot more expensive than what we were dealing with, but the problem is, we weren’t measuring it correctly. What I mean by that is, our time has a value and especially when you can’t utilize certain resources effectively, there’s a cost to that. It turns out that Datto was actually quite cheap in comparison, inexpensive if you will, because we weren’t dealing with that.”
Let’s talk about the actual recovery scenario.
“Actually, yeah,” Chad says, “Let me just take you right through it. It really was a worst case scenario for us and for the client. We had started back the first day after the New Year holidays and there was an alert that there was a degrading status on the array. This was an inherited server, so it wasn’t one we deployed and so there were some odd things about it we didn’t like, but we were living with. The customer had been really oversold just before we came on board and so, although we did replace the backups obviously and the firewalls, we left that sort of stuff in place. What ended up happening was the server had an audible beep on it and we couldn’t turn off the audible beep without rebooting the server and going into the RAID BIOS, again because of the configuration of the server, and I asked the client if he could hold off until outside of hours, because it really was a bad idea to do this during business hours. He said that he could not wait. He would not remain sane until the end of the day. We sent in a tech and when he rebooted to turn off the audible alert, it reported a multiple-drive failure. I think, at that point, it was two drives that had failed. It was RAID 6, so it should have been able to handle a two-drive failure, but I guess a third drive actually failed during the process. We basically were down. The tech, he tried to ... To say he tried to get the array back online is wishful thinking is what he was doing. He probably spent two minutes doing some wishful thinking and when that was failing out, we had him move forward. You have to understand that this was ... I wouldn’t say the middle of the first day back, but I think it was around 11 a.m. or something like that. They have a main office where they have their Exchange server and their terminal server, remote desktop server. Their second office relies on that office because their accounting software is there and without it up, they can’t check customer balances. They can’t print checks.
They can’t make invoices. They basically can’t do anything. They can do work, but they can’t bill anybody and they have no idea what anybody owes so this was horrible. That’s when we decided to abandon the old server and back up to Datto. We booted up the two VMs on the Datto, logged in and changed the IP addresses and he was back in business. We took the old server back to our office and, for anyone listening that’s interested, when we were wiping the disks, testing them, five out of six of the drives failed during that time. Pretty much all the drives were toast. I should also note the customer was doing an old version of VMware but, on the new server, we wanted to go to Hyper-V for various reasons, doesn’t really matter. With the Datto backups, we were able to just copy the VHDs over to the new server after hours and boot them up. No issue. No conversion software. No extra work. No downtime. It doesn’t get any better than that.”