Steps Taken to Avoid Data Loss and Ensure Continuity

The home page of Junior Achievement of New York (JANY) states that it “Teaches Young People About Money Management and How Business Works.” Well, JANY nearly learned the ultimate lesson in both when it experienced a catastrophic system failure. And what’s worse? JANY and its Managed Service Provider (MSP) had made all the right preparations!

Connect Computer of Fairfield, Conn. manages JANY’s systems. Connect is a full-service MSP serving the needs of clients from small businesses to the enterprise. While Connect aims to help businesses increase their productivity through better use of technology, its number one goal is to keep its clients’ businesses operational without interruption.

A first order of business for Lynn Souza, President of Connect, and her team when they started working with JANY was to evaluate the existing backup, recovery and continuity solution. It was tape. Given the limited and unreliable nature of tape, Connect proposed a more robust backup and disaster recovery (BDR) solution to JANY, one that could also deliver Business Continuity and mitigate downtime should a disaster occur. They recommended Datto SIRIS. SIRIS goes a step beyond BDR, offering Business Continuity. Based on a hybrid cloud model, SIRIS combines image-based backup, instant local and off-site virtualization, Screenshot Backup Verification™ and Inverse Chain Technology™, all of which yield a superior Recovery Time Objective (RTO). RTO is basically the length of time a business can afford to be “down.” When this was presented to Christopher Malin, Vice President of Finance and Administration with JANY, he was immediately on board to implement the new SIRIS solution. Chris also notes that the solution cost is “very reasonable” and fits in the not-for-profit (NFP) budget.

“The only way the data would be OK and I’d keep my job was if the Datto worked, and it did.”
Jeremy Truman,
Senior Systems Engineer
Connect Computer

And yes, even a not-for-profit needs to be aware of its RTO. Consider the fact that Junior Achievement of New York has served a total of 67,850 students and raised a gross total of more than $3.5 million. That’s a lot of students to help, donations to put to work, and data to manage. Should JANY experience data loss or downtime, proprietary student information would be compromised and future donations would be in jeopardy.

The operations team for JANY’s building alerted Chris that they would be cutting off power to the building to upgrade the electrical system. With plenty of time to prepare, Chris, along with Jeremy Truman, Senior Systems Engineer with Connect, did all the right things. Jeremy remotely shut down all of JANY’s servers. Chris went one step further and actually unplugged everything. This was on a Friday. What could possibly go wrong?

On Monday, Jeremy was on-site with Chris when the power was restored, to make sure both servers, which hosted seven virtual machines, rebooted correctly. Things didn’t go according to plan. One of the two physical servers started; the other failed miserably. Four of its eight hard drives “fell out” of the array, and because the array was RAID 5, which can tolerate the loss of only one drive, all of its data was affected. It turned out the drive configurations were corrupt, and Connect could not put the RAID arrays back together without problems and potential data loss.
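
Why is losing four drives so much worse than losing one? RAID 5 keeps one parity block per stripe, so the contents of a single missing drive can be recomputed from the survivors, but nothing more. The Python sketch below illustrates the idea under simplifying assumptions: the 8-drive stripe, the block contents and the function names are all hypothetical, and this is not the RAID controller’s actual logic or JANY’s real configuration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_stripe(stripe):
    """Rebuild one stripe of a RAID 5-style array.

    `stripe` holds one block per drive (data blocks plus one parity block).
    None marks a drive that has dropped out of the array.
    Raises ValueError when more than one block is missing.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError(
            f"{len(missing)} drives missing; single parity can rebuild only one"
        )
    rebuilt = list(stripe)
    if missing:
        # XOR of all surviving blocks (data and parity) equals the lost block.
        survivors = [block for block in stripe if block is not None]
        rebuilt[missing[0]] = xor_blocks(survivors)
    return rebuilt


# Hypothetical 8-drive stripe: seven data blocks plus one parity block.
data = [bytes([n] * 4) for n in range(1, 8)]
stripe = data + [xor_blocks(data)]

# One drive lost: the stripe can be rebuilt exactly.
one_lost = list(stripe)
one_lost[2] = None
print(recover_stripe(one_lost) == stripe)

# Four drives lost, as in the outage described above: no rebuild is possible.
four_lost = [None] * 4 + stripe[4:]
try:
    recover_stripe(four_lost)
except ValueError as err:
    print("Unrecoverable:", err)
```

With several drives gone at once, parity alone cannot bring the data back, which is why an independent backup matters.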

Jeremy and Connect’s first priority was to get JANY operational. To do so, they spun up the domain controller, Exchange and Raiser’s Edge (a database program) locally on the Datto SIRIS. Less than two hours later, they had three virtual servers running locally off the Datto SIRIS, and JANY’s systems were back in operation. Jeremy noted that if JANY had still been using tape, “it would have been days, and then no guarantee the data was valid.” The team had run a DR (Disaster Recovery) test on the SIRIS a few months prior to this server outage, so they knew the best practices for managing the recovery process efficiently. The Connect team stresses that no matter the solution you have in place, “it’s always important to perform DR tests.”

While JANY’s systems were running off the SIRIS device, the technicians had the time and opportunity to rebuild the actual server. This process took 24 hours to complete due to the size of the file exports. With the server repaired, all systems were exported back to the ESXi server and started up, and everything was back to normal. “No employee knew anything had happened,” explained Chris. JANY’s business did not miss a beat. All things considered, it was a huge success: Datto SIRIS kept upwards of 40 users working, student activities running and donations coming in.

Outcome and ROI

The Datto SIRIS more than paid for itself in an incident that started out as a simple server reboot and led to a significant crash and outage. Rebuilding the potentially lost data would have taken at least five days of work, not to mention the damaging headlines and loss of confidence in a very respected charity.