RealTime IT News

Firms Face Daunting Task Of Disaster Recovery

Tuesday was supposed to be G. Mark Hardy's first day as a senior manager with Ernst & Young in New York City. He had just come from Washington, D.C., where he had been managing director of information security services company Guardent Inc. But instead of settling into his new office that morning, Hardy, an officer in the Army Reserve, found himself in a command center coordinating reservists in rescue and recovery operations at the World Trade Center.

Weary and trying to squeeze in a short break, Hardy spoke to InternetNews.com about how the businesses once housed in the Twin Towers -- many of which are still attempting to account for their employees -- will tackle the daunting task of recovering from the events of Sept. 11.

"The organizations that will survive this disaster are those organizations with the most sophisticated real-time remote backup capabilities," he said.

Redundancy, Redundancy and More Redundancy
That sentiment was echoed by other experts, most of whom joined in some variant of the refrain, "redundancy, redundancy, and more redundancy."

Lee Clarke, associate professor of sociology at Rutgers University and an expert in organizations, technology and disasters, added that redundancy must be "meaningful," noting that a number of organizations in the World Trade Center had their disaster facilities in one of the other towers or buildings that were part of the complex. "That's pretty much like keeping your back-up tapes on the same property," he said. "You need multiple backups in multiple places."

Sunil Misra, managing director of the Unisys Worldwide eSecurity and Privacy Practice, agreed. "One of the things that has been recently demonstrated is that proximity can be an issue," Misra said. "You can have backup systems and backup recovery plans, but if they're in the same location or close by, then that doesn't solve the problem."
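
The proximity concern Clarke and Misra describe can be checked mechanically. The following is a minimal Python sketch, not a method any of the experts described: it computes the great-circle distance between a primary site and each backup site and flags any backup that falls inside a minimum separation. The site names, the coordinates and the 50 km threshold are all hypothetical values chosen for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def too_close(primary, backups, min_km=50.0):
    """Return the names of backup sites closer than min_km to the primary site.

    min_km is an illustrative threshold (an assumption), not a recommendation.
    """
    return [name for name, (lat, lon) in backups.items()
            if haversine_km(primary[0], primary[1], lat, lon) < min_km]


# Hypothetical sites: one backup a few hundred metres away, one in another city.
primary_site = (40.7100, -74.0100)
backup_sites = {
    "same_complex": (40.7110, -74.0090),
    "other_city": (39.9500, -75.1700),
}

print(too_close(primary_site, backup_sites))
```

A check like this only measures distance. It says nothing about shared power grids, carriers or flood plains, which would need separate review.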

So how should businesses go about planning for the unforeseen? Recognizing that a plan is needed to maintain mission-critical operations is the first step.

"If a company intends to do something new instead of exercise a previously defined option, they show a lack of preparation," Hardy said. "Effective disaster recovery planning begins outside the existence of a specific attack."

Col. Marc Enger, executive vice president of security operations for Texas-based Digital Defense Inc. and formerly director of operations for the U.S. Air Force's Air Intelligence Agency (AIA), noted, "First you identify the critical operations that you have. Then you look at that critical operations set and say, 'what do I need for the next 30, 60, 90 days?' Then you build off-site facilities."

That process includes mirroring data to off-site storage locations, often through other nodes held by the company or through agreements and alliances with other businesses to use their nodes, Enger said.
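
As a rough illustration of the mirroring Enger describes, the Python sketch below copies a file to several replica locations and confirms each copy with a SHA-256 checksum. It is a simplified, one-shot copy rather than the real-time replication Hardy refers to, and it is not the process any company in this article uses. The `mirror` function, the file name and the local directories standing in for off-site nodes are all hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror(source, replica_dirs):
    """Copy source into each replica directory.

    Returns a dict mapping each replica directory to True if the copy's
    checksum matches the source, False otherwise.
    """
    source = Path(source)
    expected = sha256_of(source)
    results = {}
    for replica in replica_dirs:
        replica = Path(replica)
        replica.mkdir(parents=True, exist_ok=True)
        target = replica / source.name
        shutil.copy2(source, target)  # copies contents and file metadata
        results[str(replica)] = sha256_of(target) == expected
    return results


# Demo: temporary local directories stand in for off-site nodes (an assumption).
with tempfile.TemporaryDirectory() as workdir:
    data_file = Path(workdir) / "critical_records.dat"
    data_file.write_bytes(b"example payload")
    status = mirror(data_file, [Path(workdir) / "node_a", Path(workdir) / "node_b"])
    print(all(status.values()))
```

In practice the replica paths would point at storage in separate facilities, and a failed checksum would trigger a retry or an alert rather than just a False entry.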

Broad-based Participation
But companies often make mistakes when it comes to identifying critical operations, according to Hardy.

"It's very easy to protect the wrong thing," he said. Hardy -- and the other experts -- recommended broad-based participation in the planning process from most or all departments, with senior executive management approving the determination of which assets are most critical. Otherwise, businesses run the risk of having decision-makers who don't fully understand the minutiae of what each department needs to function.

But Hardy said that doesn't mean firms should try to create disaster preparedness and disaster recovery plans themselves. "I think it would be wise for them to seek the advice of experts in this area," he said. "Trying to roll your own disaster recovery plan is a little like trying to pack your own parachute. Unless you really know what you're doing it's not a very good idea."

He added, "Exercise the plan regularly until you're absolutely certain that your people can respond to an emergency without actually having to stop to think about it."

Misra said that when Unisys helps clients prepare their plans, one of the company's focuses is the location of, and access to, the data center. He said it is important to protect the computer system from outside access -- using proper locks, doors and other physical barriers -- and to look at elements relating to fire. Certain locations may require extra care, such as flexible jelly floors and fire retardants. Enger added that natural disasters -- from floods to earthquakes -- need to be taken into account.

As an example, Exodus Communications Inc., one of the largest managed hosting providers in the U.S. with 44 Internet Data Centers (IDCs) worldwide, detailed some of its security measures Tuesday.

"Exodus IDC facilities are constructed for security including seismic stability, and the company maintains extensive physical controls including manned entry points, alarm systems, biometric identification, video surveillance monitoring, on-site 24-hour certified security personnel, fire suppression and environmental controls, and identification procedures for entry into each of its facilities," the company said. It added, "In addition, Exodus IDCs are equipped with a number of redundant subsystems, such as multiple fiber trunks coming into each IDC from multiple sources, fully redundant power on the premises, and multiple backup generators to deliver the highest levels of reliability."

The Human Factor
But Clarke noted grimly that a large factor in Tuesday's events, one that is often overlooked and is the most difficult to address, is the loss of personnel.

"Morgan Stanley lost maybe 20 percent of their workforce in there," Clarke said, referring to the financial firm that had the largest presence in the World Trade Center. (The firm has since reported that most of its employees escaped.) "Is it going to ruin their organization? No. But an organization that had only 50 people, 100 people, what can they do to recover? How do the others pick up the slack? Unfortunately, the answer is more organization -- IT especially. A huge part of the IT world is small- and medium-sized organizations. In principle, the answer is redundant procedures, redundant organization. That's an obvious solution to use when possible. Lots of companies are already moving or have moved to that to some degree, but it's never going to be a full solution."

So far, the clearest examples -- though certainly not the only ones -- of companies that were prepared to deal with this week's events were telecommunications carriers. On Tuesday, New York City Mayor Rudolph Giuliani was one of the first to praise the "heroic" work of employees at Verizon Communications, which owns most of the area's local telephone infrastructure, in striving to keep communications up and running.

In the wake of Tuesday's events, communications infrastructure became an early concern as telephone service grew intermittent at best. Indeed, New York City Police efforts were hampered to a degree Tuesday because of trouble receiving phone calls at headquarters. But Misra noted that telecommunications carriers acted rapidly and with a high degree of organization to solve those problems.

"By and large, I believe our communications infrastructure held up pretty well," he said. "We have people who were in the World Trade Center and around the World Trade Center and in the end we were able to contact everybody."