“It Happens to the Best of Them” NaSPA’s President Reflects on a Personal Data Recovery Disaster

By NaSPA President Leo A. Wrobel

After 30 years of writing articles on disaster recovery, one would think my office would be about the best prepared small office around. We thought we were. That was until Friday, February 12, 2010 when the Dallas / Ft Worth area received 14 inches – yes – 14 inches of snow. The previous record was 7 inches, and stood for over 30 years. Before that, who knows. It just never happens here.

Right around close of business on Friday the power went off. No problem, the office has a backup generator. We finished up the week and never missed a beat. But I digress. The reason for this novel article is not only to tell you about how Mother Nature put the whammy on us last week. It is also to illustrate first hand how data disasters happen to the best of us, even when you believe you have the bases covered. We THOUGHT our data was backed up. Well….

So what happened?
You know what happens when you assume. First, we assumed that since all of our data was backed up on a RAID server that it was safe and invincible. As most of you know, RAID is the acronym first defined by David A. Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987 to describe a redundant array of inexpensive disks. Generally speaking, RAID refers to any technology that allows computer users to achieve high levels of storage reliability by arranging hard drives into arrays for redundancy using special hardware or software. If a drive fails in a RAID system, the data is designed to remain accessible on another redundant drive. Therefore our first assumption was the blind trust that if something happened to a disk drive, the other redundant drive in our RAID would take over. Moreover, we assumed that if something happened to a power supply or logic board, we would simply pop one of the drives out and drop it into a spare computer to resurrect the data. Next we assumed surge protectors do what their name implies. Most importantly, we underestimated the cost of resurrecting even relatively small quantities of data in a worst-case scenario and the financial impact it has, even on a relatively small office like ours.

The disaster began unfolding about 8 PM Saturday – not when the power went out but when it came back on. Up until that time we operated warm and happy for 28 hours on backup generator power. When we went back to commercial power, the electric company (as is often the case) was still doing its thing. The power flicked on and off two or three more times Saturday night. During one of those times there was a surge or spike in the commercial power.

First the good parts of our planning: Our phones stayed up. In case they didn’t, we subscribe to the Telephone Recovery voice recovery system – a versatile and affordable service. If the phone company or our telephone system had failed, our calls would have been instantly redirected to our cell phones. We were well prepared on voice communications. Our Internet access stayed up too. We use two separate wireless Internet service providers (WISPs) and a dual WAN (wide area network) router. In fact, ONE of the two Internet feeds did fail, since its tower was affected by the same power outage. However, the other diverse link plugged right along so nobody was the wiser. IP communications is marvelous!

When the power came up, however, we didn’t realize that the spike had killed the RAID server – and zapped four years of our data!

Ok, you may now lecture us on the importance of back up copies. But golly, isn’t that why you buy a RAID server in the first place? The RAID server by its nature IS a backup copy. Or so we ASSumed.

In hindsight, we obviously made some errors in judgement. First, we had unrealistic expectations on what constitutes a RAID “server.” Our “server” is actually a Fry’s Electronics (or shall I say FRIED electronics) retail store special. The actual nomenclature for the thing is an Iomega StorCenter ix2 Network attached file storage unit. It worked well for us. Now may it rest in peace.

Recovery however was not as simple as just buying another $400 server. It took two days when all was said and done:

  • Option A was to remove one of the two redundant SATA drives and slap it into one of the office computers. This action quickly exposed the first flaw in our carefully contrived backup plan. Iomega uses a proprietary file system. None of the office PCs could read it. That being said, we moved on to Option B:
  • Option B was to tough it out and call Iomega’s data recovery people. Mind you, we had six months’ worth of data backed up on other servers and stored off site – kind of a belt and suspenders approach given the fact that the RAID server had two redundant drives. In Iomega’s defense, their help line was answered on the third ring by a knowledgeable technician. When asked about the recovery procedure, he explained that we simply had to overnight ship the server to them. They would perform a “no cost” assessment of the problem and call back with a quote to fix the unit and/or resurrect the data. If we didn’t want them to take any further action, we could simply ask for the unit back. It was then that I asked for a “range” of costs associated with such a task. The quote? Between $700 and $3000. Maybe more. The original Fry’s Electronics price tag on the box was $249.99. $3000 was obviously a different kind of proposition.
  • Lesson #1: The price to fix something is not based on the dollar value of the unit itself; it is based on the value YOU place on the data it contains. This is something I have written about for 30 years. It’s another thing altogether to live it first hand. As a small office, a $3000 hit was a big deal. Losing four years of archive data that we thought was safe was even worse to contemplate.
  • Riddle: Why did the CPA cross the road? Answer: Because he looked in the file and that’s what he did last year. Many professions use the same electronic data again and again. We do the same thing, so even “old” data is important to us.
  • Anyhow, being the tightwad that I am, I opted to first try an Option C. I searched eBay and the Factory Outlets for the same unit. I reasoned that I could find a good used unit, swap the drives, restore the data, and then punt both units out the back door. I even found one on Iomega’s web site for $137.00 but was still unsure whether it would be compatible with the proprietary file structure. Lesson #2: Proprietary solutions are always an issue when recovering from a disaster!
  • In the final analysis, I fixed the problem myself, using a combination of perseverance, elbow grease and blind luck. Here is how.

When we installed one of the affected drives in a Windows XP machine (Option A), it did not show up as a drive letter. It did show up, however, as an unknown drive when we brought up the Administrative Tools / Disk Management function under Windows and looked at the drive layout. Anyhow, having heard that many manufacturers of these units use the Linux file management system, I pulled out a CD of SLAX Linux purchased from an electronics flea market three years ago. Bingo, it not only found the Iomega drive but actually read the files too. The trouble was, for whatever reason, it would not create a directory or write to a Microsoft Windows hard drive, perhaps since it was only an evaluation copy. Sensing I was close to a solution, however, I pulled out a copy of Xandros Linux. Xandros’ claim to fame is as a bridge between Linux and Microsoft environments. That did the trick. Once Xandros was installed it was a simple matter to copy all of the files from the Linux partition to a Microsoft partition that could be read by the other computers in the office. Wow, that was a close one. Leo was the hero. Or was he? What did this minor disaster really cost?
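For readers who end up doing a similar rescue copy, here is a minimal sketch of that last step in Python. It is not what we ran; we simply copied the files by hand. It assumes the rescued drive and the destination drive are both already mounted and readable, and the function names (sha256_of, copy_and_verify) are invented for this illustration. The idea is to copy every file and then compare checksums, so you know the copy is good before you let go of the damaged unit.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_root: Path, dest_root: Path) -> list[Path]:
    """Copy every file under source_root into dest_root.

    Returns the relative paths of any files whose copy does not match
    the original checksum (an empty list means everything verified).
    """
    mismatches = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue  # directories are created as needed below
        rel = src.relative_to(source_root)
        dst = dest_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        if sha256_of(src) != sha256_of(dst):
            mismatches.append(rel)
    return mismatches
```

You would call it with your own mount points, for example copy_and_verify(Path("/mnt/rescued"), Path("/mnt/office_share")), where both paths are placeholders. Anything in the returned list should be copied again before the old drive is wiped or shipped off.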

Lesson #3 – Disasters Cost Money even when YOU recover yourself! First, Leo had to do the work himself. That meant pulling him off $250+/hr. assignments for clients for two full days. That may not sound like a lot of money but you also have to measure the customer confidence issues, and the ticked off clients who waited while Leo dinked around with data restoration.

My best guess is that we broke about even on what Iomega would have charged to do the job. I now have a new appreciation, though, for the indirect costs incurred when executives in small and medium sized offices work on IT problems instead of doing their jobs. It happens much more often than people imagine. We learned a valuable lesson, but we also got some great material for this article. (Even if it means eating some humble pie, and a collective laugh at Leo’s expense.) The lessons learned boil down to what we have said for 30 years.

  • Expect the unexpected and don’t assume anything.
  • The value of disaster recovery has little to do with the price of the equipment, but really with the value of the data it contains.
  • The cost of the restoration process corresponds linearly with the hourly rate of the people called to restore it. For us, it was actually more expensive for me to do the recovery than to retain a more qualified I.T. technician.
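To put rough numbers on that last point, here is a back-of-the-envelope sketch in Python. The hourly rate, the two days, and the $700 to $3000 quote come from the story above. The eight billable hours per day is an assumption made for illustration, not a figure we tracked, and the function name is made up for this example.

```python
HOURLY_RATE = 250        # dollars per hour, the low end of "$250+" above
RECOVERY_DAYS = 2        # two full days of do-it-yourself recovery
HOURS_PER_DAY = 8        # ASSUMPTION: billable hours lost per day
VENDOR_QUOTE_LOW = 700   # low end of the quoted range
VENDOR_QUOTE_HIGH = 3000 # high end of the quoted range ("maybe more")


def self_recovery_cost(hourly_rate: int, days: int, hours_per_day: int) -> int:
    """Billable revenue given up while doing the recovery in-house."""
    return hourly_rate * days * hours_per_day


cost = self_recovery_cost(HOURLY_RATE, RECOVERY_DAYS, HOURS_PER_DAY)
print(f"Do-it-yourself recovery: ${cost}")  # prints "Do-it-yourself recovery: $4000"
print(f"Vendor quote: ${VENDOR_QUOTE_LOW} to ${VENDOR_QUOTE_HIGH}")
```

At eight billable hours a day the do-it-yourself route comes to $4000, above the top of the quoted range. At six billable hours a day it comes to $3000, which is the "about even" case. Neither figure counts the customer confidence issues, which are much harder to price.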

As it turns out we survived, but only after two days of the proverbial wailing and gnashing of teeth. We hope this article strikes a resonant chord with a few of you as it did with us, and that you take a second look at even the small systems that support you and your organization. If things like this happen to us they can happen to anyone. Happy Planning!

PS – As NaSPA’s President I like to think that our proud 24 year old organization is one that can laugh – even at itself, as in this article. Besides, there might be a tip or two here that could even be useful! If you are not a NaSPA Member, why not consider joining right now? It’s one of the least expensive memberships available for the benefits you receive. Membership also opens the door to a vast NaSPA library of other articles and tips – even the humorous ones like this article. Please consider membership.  Click HERE for more details or to join immediately.

About the Author

NaSPA President Leo A. Wrobel has over 30 years of experience with a host of firms engaged in banking, manufacturing, telecom services and government. An active author and technical futurist, he has published 12 books and over 700 trade articles on a wide variety of technical subjects. Leo is presently CEO of Dallas-based and b4Ci. Inc. See http://www.b4Ci.com, call (214) 888-1300, or email leo@b4ci.com.

Copyright 2008 NaSPA, Milwaukee, WI. All rights reserved.
7044 S. 13th St. Oak Creek, WI 53154 • (414) 908-4945