Disaster Recovery Documentation

No step is more critical for success in the field of Information Technology than documentation. It is squarely in the middle of the System Development Life Cycle, but is usually left until after testing and deployment. However, when it comes to a successful Disaster Recovery strategy, documentation is even more important than it is for standard IT functions.

While most companies have at least some plan for data backup and recovery, a disaster recovery strategy requires more than just being able to restore lost data.

A disaster recovery plan does not just entail having a backup plan for the data in your servers. It is the overarching document that guides your employees and partners in restoring functionality to your business. Because your technology and systems are constantly being updated or changed, this plan has to be a “living document.”

Changes to systems, teams, and architecture must be accounted for in your disaster recovery plan as they happen. This information must also be available to everyone responsible for any part of the recovery process. Let us dive into this a little further and we’ll help you develop a roadmap for documentation of your disaster recovery plan.

The Elements of a Good Plan

A good disaster recovery plan document will be detailed, kept up-to-date with current information, and accessible by anyone who needs to refer to it in the case of a disaster. The elements will vary according to your company’s structure, the type of business you have, and what services you have supported by partners like your compliant cloud and colocation provider.

The following is a short list of essential elements that you need to include at a minimum:

Communication and Roles

Who does what and how to reach people are two of the most essential elements in the immediate aftermath of a disastrous event. Contact information for all employees and providers that are essential to disaster recovery needs to be kept up-to-date and readily accessible. Each team’s and team member’s roles must also remain current. In an emergency, this information must be clearly outlined.

Schematics

Diagrams of the equipment, infrastructure, and data flow can be an essential part of any necessary restoration or rebuilding.

Systems and Asset Inventory

The Systems and Asset Inventory covers physical assets, like servers and laptops. It also includes your agreements with providers and vendors. If you are outsourcing your primary IT and data to a hosting provider, you will have a shorter list of actual assets, but will need to know exactly what your agreement provides.

Application Dependencies and Prioritization

Detailing which applications interact with others is essential to the plan. Your plan should list the applications you need to restart first, identify which apps are mission critical and which you can delay restarting, and assign a recovery priority to each. Once determined, these should be outlined in both your internal and external Service Level Agreements (SLAs). You will also need to have a step-by-step roadmap for your administrators to follow, so that systems like point-of-sale payment or customer-facing applications are restored quickly, while those that can be delayed slightly are moved lower down on the list.
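One way to turn a dependency list into a restart roadmap is a topological sort with a priority tie-break. The following is a minimal sketch, not a prescribed method: the application names, priority numbers, and the `restart_order` helper are all hypothetical, and it assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: app name -> (priority, apps it depends on).
# A lower priority number means "restore sooner". All values are illustrative.
APPS = {
    "database": (1, []),
    "auth": (1, ["database"]),
    "point_of_sale": (1, ["database", "auth"]),
    "customer_portal": (2, ["auth"]),
    "reporting": (3, ["database"]),
}


def restart_order(apps):
    """Return a restart order that never starts an app before its
    dependencies, and prefers higher-priority apps among those ready."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in apps.items()})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order, ready = [], []
    while sorter.is_active():
        # Collect apps whose dependencies have all been restarted.
        ready.extend(sorter.get_ready())
        # Pick the most urgent ready app; the name breaks ties deterministically.
        ready.sort(key=lambda name: (apps[name][0], name))
        current = ready.pop(0)
        order.append(current)
        sorter.done(current)
    return order


if __name__ == "__main__":
    for step, app in enumerate(restart_order(APPS), start=1):
        print(step, app)
```

With the sample data, the database comes first because everything depends on it, and the low-priority reporting system lands last even though it becomes eligible early.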

RTO and RPO

Recovery Time Objective (RTO) is the “deadline” in a disaster recovery situation. It is determined by evaluating how quickly your system needs to be back online when something goes wrong. Recovery Point Objective (RPO) is the maximum age of data you can accept restoring from, which is to say how much recent work you can afford to lose. Your backup and replication strategy and schedule will determine how recent the data you are restoring will be, so make sure the latest backup is never older than the RPO you have set. Think about the potential re-work required when you determine RPO. It will be different for every business and sometimes different for individual applications.
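The two objectives reduce to simple time comparisons. The sketch below is illustrative only: the function names, timestamps, and the four-hour and one-hour targets are assumptions chosen for the example, not recommended values.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, now, rpo):
    """True if the newest backup is no older than the RPO."""
    return now - last_backup <= rpo


def meets_rto(outage_start, restored_at, rto):
    """True if service came back within the RTO."""
    return restored_at - outage_start <= rto


if __name__ == "__main__":
    # Hypothetical targets for a single application.
    rto = timedelta(hours=4)
    rpo = timedelta(hours=1)

    outage_start = datetime(2024, 5, 1, 9, 0)
    last_backup = datetime(2024, 5, 1, 8, 30)   # 30 minutes before the outage
    restored_at = datetime(2024, 5, 1, 14, 0)   # 5 hours after the outage

    print("RPO met:", meets_rpo(last_backup, outage_start, rpo))
    print("RTO met:", meets_rto(outage_start, restored_at, rto))
```

In this made-up scenario the backup schedule satisfies the RPO, but the restore takes longer than the RTO allows, which is the kind of gap a documented and tested plan is meant to surface before a real event.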

Regulatory Compliance

After disaster recovery events, most industries have regulatory obligations regarding reporting, documentation, and protection against further incidents. If your business is subject to regulations that require reporting after an outage or breach, this section is a must-include. These regulations and standards may include HIPAA/HITECH, PCI DSS, ISO 20000-1, ISO 27001, or SSAE 18 SOC 1, SOC 2, and SOC 3.

Different Degrees of Disaster

The worst-case scenario for a business disaster is obvious: some catastrophic event physically destroys your primary site. Whether it is a tornado, fire, or man-made disaster, most people can easily understand “what if everything at our primary location were suddenly wiped out?”

There are varying degrees of failure that can occur with your business’s mission critical systems and data that might not be as immediately evident. Partial loss or corruption of data, security breaches, temporary service outages, and even the loss of personnel who play key roles can constitute a disaster that impacts your day-to-day operations.

If you are using a hosting provider for all or part of your solution, that can mitigate the impact of all of these. But an additional key component is proper documentation, especially in the last case.

Good Documentation Versus Tribal Knowledge

When key IT personnel leave your business, whatever information they have not documented often goes with them. Undocumented information about systems, procedures, or necessary business information is often referred to as “Tribal Knowledge.”

“A set of unwritten rules or information known by a group of individuals within an organization but not common to others that often contributes significantly to overall quality. Tribal knowledge may be essential to the production of a product or performance of a service but may also be counterintuitive to the process.”—BusinessDictionary.com

When key employees leave, having solid documentation about the systems and business processes enables you to seamlessly onboard someone else in that position. In the case of Disaster Recovery, where time is a crucial factor, you do not have the luxury of time.

It is much harder to recreate any Tribal Knowledge that is required to get mission critical systems and data to a point of minimum viable recovery. It is imperative that clear, easy-to-access documentation of the steps, operations, and responsibilities of each person be updated regularly and maintained.

Not Once a Year, but All the Time

Again, any disaster recovery plan needs to be a “living document.” Your business needs, technology, and infrastructure are always changing. Effective documentation creates a roadmap for your employees to follow in the event something affects the data or systems that support your business’s livelihood.
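Keeping a plan “living” is easier when stale sections are flagged automatically. The sketch below assumes you record a last-reviewed date for each section of the plan; the section names, dates, the 90-day threshold, and the `stale_sections` helper are hypothetical choices for illustration.

```python
from datetime import date, timedelta

# Hypothetical record of when each plan section was last reviewed.
LAST_REVIEWED = {
    "Communication and Roles": date(2024, 1, 10),
    "Schematics": date(2023, 6, 1),
    "Systems and Asset Inventory": date(2024, 3, 5),
}


def stale_sections(last_reviewed, today, max_age=timedelta(days=90)):
    """Return the names of sections not reviewed within max_age, sorted."""
    return sorted(
        name
        for name, reviewed in last_reviewed.items()
        if today - reviewed > max_age
    )


if __name__ == "__main__":
    for name in stale_sections(LAST_REVIEWED, today=date(2024, 4, 1)):
        print("Needs review:", name)
```

A check like this could run on a schedule and notify whoever owns each section, so updates happen as systems change rather than once a year.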

Every missing step in the process will result in lost time and require backtracking and recreation. Even if 100 percent of your IT is outsourced, someone needs to know who to call and whose job that is if the worst were to happen.

Who to Call for Help

Who would you call to help strengthen your disaster recovery plan? Your answer should be LightEdge. Now that modern IT practices have started to blend physical with virtual, and cloud with on-premises, safeguarding your applications and data requires several tools and methods.

LightEdge is committed to keeping our customers’ IT operations, critical applications, and data protected. We provide the technology and resources our customers require to get back to a production state that meets their RTO and RPO requirements.

LightEdge offers a comprehensive set of disaster recovery solutions to ensure uninterrupted performance of IT operations and mission-critical systems in the event of a disaster.

The reliable availability of business IT is essential to the management and livelihood of every company, large or small. All elements hinge on the dependability of your technology to deliver vital information right when you need it.

Redundancy is built into each of our data centers, located in Des Moines, Kansas City, Omaha, Austin, and Raleigh. Each of our LightEdge facilities strives to deliver more than traditional data centers. We have created true Hybrid Solution Centers designed to offer a complete portfolio of high speed, secure, redundant, local cloud services and managed gateways to public clouds through our hardened facilities.

Want to learn more about LightEdge’s disaster recovery and business continuity services? Contact one of our disaster recovery experts to get started or to schedule your private tour of any of our data center facilities. We have disaster recovery, colocation, and business continuity experts standing by to answer any of your questions.

