|
The Bulletin
Technical Tips
By Tom Snyder, Ph.D.
10 Steps to a Disaster Plan
Most businesses depend heavily on technology and automated systems, and their disruption for even
a few days could cause severe financial loss and threaten survival. With the lessons of Hurricane
Katrina still fresh, many Bay Area businesses are thinking about disaster planning or re-thinking the
scope of their plans. The following is a 10-step process for creating an information technology
disaster recovery plan.
A disaster recovery plan is a comprehensive statement of consistent actions to be taken before,
during and after a disaster. It involves more than off-site storage or backup processing. The plan
should be documented and tested to ensure the resumption of operations in the event of a disaster.
The primary objective is to protect the organization in the event that all or part of its operations
and/or computer services are rendered unusable for more than a few days. The probability of a
disaster occurring in an organization is highly uncertain. A disaster plan, however, is similar to
liability insurance: it provides a certain level of comfort in knowing that if a major catastrophe
occurs, it will not result in financial disaster.
1. Obtain top management commitment
Top management must support and be involved in the development of the disaster recovery plan.
Management should also be responsible for coordinating the plan in the event of a disaster and
ensuring its effectiveness within the organization. If the organization receives information processing
from a service bureau, management must also:
• Evaluate the adequacy of its service bureau’s contingency plans
• Ensure that its contingency plan is compatible with its service bureau’s plan
2. Establish a planning committee
A planning committee should be appointed to oversee the development and implementation of the
plan. The planning committee should include representatives from all functional areas of the
organization. Key committee members should include the operations manager and the data
processing manager. The committee also should define the scope of the plan.
3. Perform a risk assessment
The planning committee should prepare a risk analysis that includes a range of possible disasters.
Each functional area of the organization should be analyzed to determine the potential impact
associated with several disaster scenarios. The risk assessment process should also evaluate the safety
of critical documents and vital records. Traditionally, fire has posed the greatest threat to an
organization. Intentional human destruction, earthquakes and terrorist attacks, however, should also
be considered. The plan should provide for the worst case: destruction of the main building. The planning
committee should also analyze the costs related to minimizing the potential exposures.
4. Establish priorities for processing and operations
The critical needs of each department within the organization should be carefully evaluated in such
areas as:
• Functional operations
• Key personnel
• Information
• Processing systems
• Service
• Documentation
• Vital records
• Policies and procedures
Processing and operations should be analyzed to determine the maximum amount of time that the
department and organization can operate without each critical system. Critical needs are defined as
the necessary procedures and equipment required to continue operations should a department,
computer center, main facility or a combination of these be destroyed or become inaccessible.
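One way to record the result of this analysis is a simple list of critical systems ranked by how long each department can operate without them. The following is a minimal sketch, not a prescribed method; the `CriticalSystem` record, its field names, and the sample systems and hours are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CriticalSystem:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    department: str
    max_downtime_hours: float  # longest the department can operate without it


def recovery_order(systems):
    """Return systems sorted so the least tolerable outage comes first."""
    return sorted(systems, key=lambda s: s.max_downtime_hours)


# Illustrative values only; real figures come from each department's analysis.
systems = [
    CriticalSystem("Payroll", "Finance", 72),
    CriticalSystem("Order entry", "Sales", 4),
    CriticalSystem("E-mail", "All", 24),
]

for s in recovery_order(systems):
    print(f"{s.name} ({s.department}): restore within {s.max_downtime_hours} hours")
```

Sorting by tolerable downtime gives the planning committee a first-pass recovery order that it can then adjust for dependencies between systems.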
5. Determine recovery strategies
The most practical alternatives for processing in case of a disaster should be researched and
evaluated. It is important to consider all aspects of the organization such as:
• Facilities
• Hardware
• Software
• Communications
• Data files
• Customer services
• User operations
• MIS
• End-user systems
• Other processing operations
Alternatives, dependent upon the evaluation of the computer function, may include:
• Hot sites
• Warm sites
• Cold sites
• Reciprocal agreements
• Two data centers
• Multiple computers
• Service centers
• Consortium arrangement
• Vendor supplied equipment
• Combinations of the above
Written agreements for the specific recovery alternatives selected should be prepared, including the
following special considerations:
• Contract duration
• Testing
• Costs
• Special security procedures
• Notification of systems changes
• Hours of operation
• Specific hardware and other equipment required for processing
• Personnel requirements
• Circumstances constituting an emergency
• Guarantee of compatibility
• Availability
• Priorities
6. Perform data collection
Recommended data gathering materials and documentation include:
• Backup position listing
• Critical telephone numbers
• Communications inventory
• Distribution register
• Documentation inventory
• Equipment inventory
• Forms inventory
• Insurance policy inventory
• Main computer hardware inventory
• Master call list
• Master vendor list
• Microcomputer hardware and software inventory
• Notification checklist
• Off-site storage location inventory
• Software and data files backup/retention schedules
• Telephone inventory
• Temporary location specifications
It is extremely helpful to develop pre-formatted forms to facilitate the data gathering process.
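If the forms are kept electronically, a pre-formatted form can be as simple as a fixed list of required fields plus a check for blanks. The sketch below assumes an equipment inventory form; the field names in `REQUIRED_FIELDS` and the sample entry are hypothetical, and each organization would define its own.

```python
# Hypothetical required fields for an equipment inventory form.
REQUIRED_FIELDS = ("description", "location", "serial_number", "vendor", "vendor_phone")


def missing_fields(record):
    """Return the required fields that are absent or blank in a form record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


# Illustrative entry only: a partly completed form.
entry = {"description": "File server", "location": "Server room", "vendor": " "}

for name in missing_fields(entry):
    print(f"Still needed: {name}")
```

A check like this lets the person collecting the data see at a glance which forms are incomplete before they are filed with the plan.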
7. Organize and document a written plan
An outline of the plan’s contents should be prepared to guide the development of the detailed
procedures. Top management should review and approve the proposed plan. The outline can
ultimately be used for the table of contents after final revision.
A standard format should be developed to facilitate the writing of detailed procedures and the
documentation of other information to be included in the plan. Standardization is especially
important if more than one person is involved in writing the procedures.
The plan should be thoroughly developed, including all detailed procedures to be used before, during
and after a disaster. It may not be practical to develop detailed procedures until backup
alternatives have been defined. The procedures should include methods for maintaining and
updating the plan to reflect any
significant internal, external or systems changes. The procedures should allow for a regular review of
the plan by key personnel within the organization.
The disaster recovery plan should be structured using a team approach. Specific responsibilities
should be assigned to the appropriate team for each functional area of the company. There should
be teams responsible for administrative functions, facilities, logistics, user support,
computer backup, restoration and other important areas in the organization.
The structure of the contingency organization may not be the same as the existing organization chart.
The contingency organization usually structures teams responsible for major functional areas such as:
• Administrative functions
• Facilities
• Logistics
• User support
• Computer backup
• Restoration
• Other important areas
The management team is especially important because it coordinates the recovery process. The team
should assess the disaster, activate the recovery plan, and contact team managers.
Management team members should be the final decision-makers in setting priorities, policies and
procedures. Each team has specific responsibilities that must be completed to ensure successful
execution of the plan. The teams should have an assigned manager and an alternate in case the
team manager is not available. Other team members should also have specific assignments where
possible.
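The manager-and-alternate rule is easy to verify if the team roster is kept in a structured form. The following sketch shows one possible check; the `RecoveryTeam` record, its fields, and the sample names are hypothetical and not part of any standard.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryTeam:
    # Hypothetical roster record; field names are assumptions for illustration.
    name: str
    manager: str = ""
    alternate: str = ""
    members: dict = field(default_factory=dict)  # member name -> assignment


def staffing_gaps(teams):
    """List teams that lack a manager or a distinct alternate."""
    gaps = []
    for team in teams:
        if not team.manager:
            gaps.append(f"{team.name}: no manager assigned")
        if not team.alternate:
            gaps.append(f"{team.name}: no alternate manager assigned")
        elif team.alternate == team.manager:
            gaps.append(f"{team.name}: alternate is the same person as the manager")
    return gaps


# Illustrative roster only.
teams = [
    RecoveryTeam("Facilities", manager="A. Lee", alternate="B. Cruz"),
    RecoveryTeam("Logistics", manager="C. Diaz"),
    RecoveryTeam("User support", manager="D. Kim", alternate="D. Kim"),
]

for gap in staffing_gaps(teams):
    print(gap)
```

Running a check like this during each regular review of the plan helps catch rosters that have gone stale as people change roles or leave.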
8. Develop testing criteria and procedures
It is essential that the plan be thoroughly tested and evaluated on a regular basis (typically annually).
Procedures to test the plan should be documented. The tests will provide the organization with the
assurance that all necessary steps are included in the plan.
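A regular testing schedule can be tracked with a simple date check. This is a minimal sketch that assumes the annual interval mentioned above; the function name `is_test_overdue`, the 365-day default, and the sample dates are illustrative assumptions.

```python
from datetime import date, timedelta


def is_test_overdue(last_tested, today, interval_days=365):
    """Return True if the plan was never tested or the interval has elapsed."""
    if last_tested is None:
        return True
    return (today - last_tested) > timedelta(days=interval_days)


# Illustrative dates only.
if is_test_overdue(date(2005, 1, 15), date(2006, 3, 1)):
    print("Plan test is overdue; schedule a walk-through.")
```

An organization that tests more often than annually would pass a shorter `interval_days`.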
9. Test the plan
After testing procedures have been completed, an initial test of the plan should be performed by
conducting a structured walk-through test. The test will provide additional information regarding any
further steps that may need to be included, changes in procedures that are not effective, and other
appropriate adjustments. The plan should be updated to correct any problems identified during the
test. Initially, testing of the plan should be done in sections and after normal business hours to
minimize disruptions to the overall operations of the organization.
10. Approve the plan
Once the disaster recovery plan has been written and tested, the plan should be approved by top
management. Top management is ultimately responsible for ensuring that the organization has a documented
and tested plan.
=========================
If you have questions or concerns about your particular situation, please e-mail me at tpsynder@xantrion.com. I will use your input to direct future columns.
=========================
 |