5 Ways to Prevent Cloud Outages

October 11, 2018

Author: Lightedge

Today, most organizations depend on some type of cloud service to run their business successfully. As a result, cloud service providers are expected to deliver uninterrupted service to these businesses 24/7/365. In a digital era where information is constantly available, there is never a good time for a cloud outage.

In 2017, credible cloud providers such as IBM, Amazon Web Services (AWS), Google, and Apple all experienced cloud outages. While everyone, including cloud service providers, can fall victim to outages, there are some straightforward steps you can take to keep IT systems up, running, and secure.

What Compensation Does Your Cloud Service Vendor Provide?

In February 2017, an employee at Amazon Web Services mistyped a command while debugging an S3 storage system, and many major enterprise platforms were down for four hours. Afterward, the affected companies wondered whether they would be compensated for the revenue they lost during the downtime.

Cloud service providers are all different, so it is important to understand the SLA before you sign an agreement. Some cloud vendors offer only minimum infrastructure and leave you in charge of building your own redundancy to protect your data from downtime in their cloud. As you might expect, these providers are less expensive, but they put the task of managing data availability on you.

Examine your SLAs closely to understand the level of compensation and coverage that your cloud provider offers. SLAs should cover you if the service’s performance falls below what your cloud vendor guaranteed.
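To judge an outage against an SLA, it helps to convert the uptime percentage into hours. The sketch below is only illustrative: the 99.9% target and the 30-day month are assumptions chosen for the example, not terms from any particular provider’s agreement, and the function names are hypothetical.

```python
HOURS_IN_30_DAY_MONTH = 30 * 24  # 720 hours; assumed billing period


def availability_percent(downtime_hours: float, period_hours: float) -> float:
    """Return measured availability as a percentage of the period."""
    return 100.0 * (period_hours - downtime_hours) / period_hours


def allowed_downtime_hours(sla_percent: float, period_hours: float) -> float:
    """Return how much downtime an uptime target tolerates over the period."""
    return period_hours * (1.0 - sla_percent / 100.0)


def sla_breached(downtime_hours: float, sla_percent: float,
                 period_hours: float) -> bool:
    """True when measured availability falls below the guaranteed percentage."""
    return availability_percent(downtime_hours, period_hours) < sla_percent


if __name__ == "__main__":
    # A four-hour outage measured against a hypothetical 99.9% monthly target.
    print(f"{availability_percent(4, HOURS_IN_30_DAY_MONTH):.2f}% available")
    print(f"{allowed_downtime_hours(99.9, HOURS_IN_30_DAY_MONTH) * 60:.1f} minutes allowed")
    print("breached:", sla_breached(4, 99.9, HOURS_IN_30_DAY_MONTH))
```

Under these assumptions, a four-hour outage in a 30-day month works out to roughly 99.44% availability, while a 99.9% target tolerates only about 43 minutes of downtime. Whether a breach leads to compensation depends on the terms of your own SLA.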

Your cloud service provider should be able to share their risk mitigation strategy for cloud outages, and explain to you how their infrastructure is designed for uptime. Redundancy and security should be built into every detail of the vendor’s infrastructure. If not, it is time to find a new cloud service provider.

1. Adopt a Redundant Multi-Cloud Environment

Enterprises that spread their workloads across multiple locations add redundancy to their environment and greatly reduce the risk of downtime. Your business should never depend on a single point of failure. Adopting a multi-cloud, multi-location IT approach will reduce risk and provide greater flexibility to adapt to changing business needs.

Seek out cloud service vendors with data center facilities in multiple locations that can spread your workloads across geographies. With this type of environment, your IT department and service provider(s) can work together to designate a primary and secondary site for your critical data. Amazon’s 2017 outage showed its customers that many things could break when object storage in just one region had a problem. An environment with a single point of failure is not adequate for a successful business.
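As a rough illustration of the primary and secondary site idea, the sketch below picks the site that should serve traffic based on a health check. The `Site` type, the role names, and the `is_healthy` callback are all hypothetical; a real deployment would rely on your provider’s own health checks and failover tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Site:
    name: str
    role: str  # "primary" or "secondary" (assumed labels)


def choose_active_site(
    sites: Sequence[Site],
    is_healthy: Callable[[Site], bool],
) -> Optional[Site]:
    """Return the first healthy site, preferring the primary over the secondary.

    Returns None when no site is healthy, which is the case a multi-location
    design is meant to make unlikely.
    """
    ordered = sorted(sites, key=lambda s: 0 if s.role == "primary" else 1)
    for site in ordered:
        if is_healthy(site):
            return site
    return None


if __name__ == "__main__":
    sites = [Site("site-b", "secondary"), Site("site-a", "primary")]
    down = {"site-a"}  # simulate an outage at the primary site
    active = choose_active_site(sites, lambda s: s.name not in down)
    print(active.name if active else "no healthy site")
```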

Redundancy should be built into your cloud provider’s data center facilities. Lightedge facilities are designed to weather nearly any conceivable incident with minimal to no downtime. Our colocation facilities have redundant isolated path power architecture for S+1 diversified electrical, multiple redundant utility transformers, generators, automatic transfer switches, main switch panels, UPSs, and PDUs.

Selecting a cloud vendor with cloud-to-cloud backups will give you peace of mind, because you will have a secondary copy of your critical data that you can fail over to in an emergency. Investing in a cloud service provider with stable data centers across multiple power grids is the safest choice. Hosting in multiple locations can also improve responsiveness by distributing traffic to the region closest to the end user.

2. Make Security and Compliance a Top Priority

Cloud services are full of opportunity, but can also open avenues for hackers to exploit vulnerabilities and cause cloud outages. Your business must protect itself with the right security tools and processes. Choosing a cloud service provider that has an extensive security and compliance portfolio is important.

While many cloud providers claim to be compliant, not all of them have certifications and accreditation to prove it. Start by vetting vendors that take a multi-layered approach to physical and cyber security. Here are some examples of different layers of security to look for:

Physical Security

Multi-layer physical security is one way to prevent cloud outages. When it comes to your mission-critical data, physical security should be a top consideration. Here are a couple of physical security factors to look for in a cloud service provider:

Monitoring Systems

Facility infrastructure is often equipped with advanced monitoring systems to provide additional security. Monitoring support to look for in a data center provider includes:

High-definition video surveillance of both the interior and exterior with archival support
Live technical monitoring by expert NOC staff
24/7/365 support from a live expert

Natural Disaster Protection

Your data needs to be stored in a place where it is safe from natural disasters, such as hurricanes, earthquakes, tornadoes, tsunamis, and floods. Any facility that hosts cloud services should be built outside of a 500-year flood plain to avoid flooding. Man-made threats, such as the potential for terrorist attacks, also need to be considered. A facility in a less populated area carries lower risk because such attacks are less likely there.

Security and Compliance Measures

Cloud hosting providers that comply with specific industry standards demonstrate an adherence to industry best practices and procedures. Even if your organization is not highly regulated, looking for a compliant cloud provider can be helpful when vetting vendors. Maintaining security and compliance in the cloud is one way to avoid cloud outages.

Whether an organization is regulated by HIPAA or PCI DSS, it is important to keep proprietary systems private in the migration process. Embracing security should be a major focus in your organization’s cloud migration checklist.

A careful review shows that loss of control and security are two major concerns for companies looking to move to the cloud. Thankfully, Lightedge’s top priority is keeping data secure and compliant while still giving you control as needed. In fact, many of our services meet the rigorous standards of:

ISO 20000-1
ISO 27001
SOC 1 Type II
SOC 2 Type II
SOC 3
HIPAA
PCI DSS

Here is a brief overview of important compliance standards your cloud service vendor should uphold:

SSAE 18
SSAE 18, or Statement on Standards for Attestation Engagements No. 18, establishes requirements and provides application guidance to auditors for performing and reporting on the following engagements, including SOC attestations:

Examinations
Reviews
Agreed-upon procedures

SOC Reports
According to the American Institute of Certified Public Accountants (AICPA), SOC Reports are designed to help service organizations (such as data center colocation providers) build trust and confidence in the services they perform and the controls related to those services through a report by an independent auditor. Each type of SOC report is designed to help service organizations meet specific user needs.

ISO 20000-1 & ISO 27001
ISO 20000-1 is the international standard for IT Service Management (ITSM), published by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). ISO 20000-1 gives cloud service providers a framework to help manage IT, while allowing them to prove they follow best practices.

Mandatory steps for a cloud service provider to become ISO 20000-1 and ISO 27001 certified include:

Internal audits
Management review
Corrective actions
Documentation review
Main audit

ISO 20000-1 and ISO 27001 are some of the strictest and hardest-to-achieve certifications that a cloud service provider can obtain. Lightedge is one of the very few cloud service providers in the entire country to be in compliance with both ISO 20000-1 and ISO 27001.

HIPAA
According to the U.S. Department of Health & Human Services, the Health Insurance Portability and Accountability Act (HIPAA) created a national set of security standards for protecting certain health information that is held or transferred in electronic form. The Security Rule operationalizes the protections contained in the Privacy Rule by addressing the technical and non-technical safeguards that organizations called “covered entities” must put in place to secure patients’ electronic protected health information (e-PHI).

PCI DSS Compliance
According to the PCI Security Standards Council, PCI DSS offers robust and comprehensive standards and supporting materials to enhance payment card data security. PCI DSS provides an actionable framework for developing a vigorous payment card data security process. This includes prevention, detection, and appropriate reaction to security incidents.

3. Test for Cloud Outages

Test, test, test. You will never regret being overly prepared for an emergency, so do not put off testing for failure. Cloud outages can be caused by external malicious attacks, insider threats, or even a simple system update. Test and plan for everything, because most of the time an outage can be prevented.

Testing for failure can include everything from testing the viability of a response plan to testing a storage migration process. Acting and responding quickly to an incident makes a huge difference.

The cloud is a great place to test for failures because it makes staging environments easy to build. Organizations can replicate their production systems in a staged layout and analyze how they will perform in different situations.
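A failure test in a staged environment can be as simple as switching off a replica on purpose and checking that reads still succeed. The following is a minimal sketch of such a drill using in-memory stand-ins; `FakeBackend`, `read_with_failover`, and the sample record are hypothetical and are not tied to any real cloud API.

```python
from typing import Dict, List, Sequence, Tuple


class OutageError(Exception):
    """Raised by a simulated backend while it is 'down'."""


class FakeBackend:
    """In-memory stand-in for a staged copy of a storage system."""

    def __init__(self, name: str, data: Dict[str, str]) -> None:
        self.name = name
        self.data = dict(data)
        self.down = False  # flip to True to inject an outage

    def read(self, key: str) -> str:
        if self.down:
            raise OutageError(f"{self.name} is unavailable")
        return self.data[key]


def read_with_failover(key: str, backends: Sequence[FakeBackend]) -> Tuple[str, str]:
    """Try each backend in order and return (backend name, value)."""
    errors: List[str] = []
    for backend in backends:
        try:
            return backend.name, backend.read(key)
        except OutageError as exc:
            errors.append(str(exc))
    raise OutageError("; ".join(errors))


def run_failover_drill() -> bool:
    """Inject outages step by step and confirm the expected behavior."""
    record = {"order-42": "paid"}
    primary = FakeBackend("primary", record)
    secondary = FakeBackend("secondary", record)
    backends = [primary, secondary]

    # Normal operation: the primary answers.
    assert read_with_failover("order-42", backends) == ("primary", "paid")

    # Injected primary outage: the secondary must take over.
    primary.down = True
    assert read_with_failover("order-42", backends) == ("secondary", "paid")

    # Total outage: the failure must be reported, not silently ignored.
    secondary.down = True
    try:
        read_with_failover("order-42", backends)
    except OutageError:
        return True
    return False


if __name__ == "__main__":
    print("drill passed:", run_failover_drill())
```

The same pattern applies to a real staging copy: inject one failure at a time, and state up front what the system is expected to do in each case.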

It is also a good idea to walk through the company’s recovery plan verbally to build confidence in those who must carry it out.

4. Reexamine Your Communication Plan

In addition to testing for cloud outages, there should be a tested communication plan in place that fits together with your disaster recovery and business continuity efforts. Unfortunately, an often-overlooked part of mitigating cloud outages is the communication plan. In the event of an outage there should be an internal communication plan for employees and an external communication plan for customers or stakeholders.

This plan should be reviewed annually or quarterly, depending on your company’s needs. A current communication plan could save your business and your customers immeasurable long-term costs.

Communicating information to relevant parties during and after an incident is key when it comes to preventing or preparing for a cloud outage. It is important to consider backup communication methods. If your primary communication portal runs on a cloud platform, plan a secondary method of communication in case that portal experiences an outage. Internal alerts can be sent by email, overhead building paging systems, and voice or text messages.
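The backup-channel idea can be sketched as an ordered list of channels that is tried until one delivers the alert. The channel names and sender functions below are placeholders for illustration only; in practice each sender would wrap whatever email, paging, or messaging system your organization actually uses.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A channel is a (name, send function) pair; send returns True on delivery.
Channel = Tuple[str, Callable[[str], bool]]


def send_alert(message: str,
               channels: Sequence[Channel]) -> Tuple[Optional[str], List[str]]:
    """Try channels in order until one delivers the message.

    Returns (name of the channel that delivered, or None; names attempted).
    """
    attempted: List[str] = []
    for name, send in channels:
        attempted.append(name)
        try:
            if send(message):
                return name, attempted
        except Exception:
            # A channel that raises is treated as unavailable; try the next one.
            continue
    return None, attempted


if __name__ == "__main__":
    def cloud_portal(msg: str) -> bool:
        raise ConnectionError("portal is down")  # simulated outage

    def text_message(msg: str) -> bool:
        return True  # placeholder for a real sender

    delivered_by, tried = send_alert(
        "Service degraded, updates to follow",
        [("cloud portal", cloud_portal), ("text message", text_message)],
    )
    print(delivered_by, tried)
```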

5. No Skimping When It Comes to Redundancy

The ROI of redundancy skyrockets in cloud outage scenarios. Critical data should be replicated across multiple availability zones and backed up. Many enterprises affected by the Amazon 2017 outage did not have a failover plan, and the lack of a redundant backup plan ended up costing them.
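One simple way to check the replication guideline is to list where each critical dataset lives and flag anything stored in fewer than two zones. This is only a sketch: the dataset names, zone labels, and the two-zone minimum are assumptions for illustration.

```python
from typing import Dict, List, Set


def under_replicated(placements: Dict[str, Set[str]],
                     min_zones: int = 2) -> List[str]:
    """Return the datasets stored in fewer than min_zones distinct zones."""
    return sorted(
        name for name, zones in placements.items() if len(zones) < min_zones
    )


if __name__ == "__main__":
    placements = {
        "orders": {"zone-a", "zone-b"},
        "invoices": {"zone-a"},  # single copy: a single point of failure
    }
    print(under_replicated(placements))
```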

Of course, redundancy has its costs. Businesses must determine whether the cost of a redundant plan is justified by the losses it would prevent in a cloud outage.
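A back-of-the-envelope comparison can help with that decision. The sketch below weighs an estimated annual outage loss against the yearly cost of redundancy. Every figure in it is a made-up placeholder, and the `risk_reduction` factor (the share of outage loss that redundancy is assumed to avoid) is a modeling assumption, not a measured value.

```python
def expected_annual_outage_cost(outages_per_year: float,
                                hours_per_outage: float,
                                cost_per_hour: float) -> float:
    """Estimate the yearly loss from outages with no redundancy in place."""
    return outages_per_year * hours_per_outage * cost_per_hour


def redundancy_is_justified(annual_redundancy_cost: float,
                            outages_per_year: float,
                            hours_per_outage: float,
                            cost_per_hour: float,
                            risk_reduction: float = 1.0) -> bool:
    """True when the loss avoided by redundancy exceeds what it costs."""
    avoided = risk_reduction * expected_annual_outage_cost(
        outages_per_year, hours_per_outage, cost_per_hour
    )
    return avoided > annual_redundancy_cost


if __name__ == "__main__":
    # Hypothetical inputs: one four-hour outage a year at 25,000 per hour.
    loss = expected_annual_outage_cost(1, 4, 25_000)
    print(f"expected annual loss: {loss:,.0f}")
    print("justified:", redundancy_is_justified(60_000, 1, 4, 25_000, 0.9))
```

This simple model ignores costs that are harder to quantify, such as reputational damage, so treat the result as a starting point for discussion, not a verdict.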

No More Outages

One of the biggest concerns about cloud services is the potential for loss of control and outages. While cloud services have many benefits and can be a critical part of any business’s operations, it is important to have a plan in place to mitigate the risk of an outage.

It is worth your time to learn about all the offerings that a cloud service provider has. Lightedge protects customers using our cloud services from top threats. From simply ordering up capacity in fully managed environments to private clouds with infrastructure customized at any layer of the stack, Lightedge’s world-class facilities, built to Tier III standards, and talented engineers are ready to accommodate your business’s requirements.

Lightedge is committed to keeping our customers’ IT operations, critical applications and data protected. We provide the technology and resources our customers require to get to a production state that meets their RTO and RPO requirements.

Lightedge also offers a comprehensive set of disaster recovery solutions to ensure uninterrupted performance of IT operations and mission-critical systems in the event of a disaster.

Redundancy is built into each of our data centers in Des Moines, Kansas City, Omaha, and newly acquired Austin and Raleigh. Each of our Lightedge facilities strives to deliver more than traditional data centers. We have created true Hybrid Solution Centers designed to offer a complete portfolio of high-speed, secure, redundant, local cloud services and managed gateways to public clouds through our hardened facilities.

Want to learn more about Lightedge’s compliant cloud services? Contact one of our cloud computing experts to get started or to schedule your private tour of any of our data centers. We have security, compliance and cloud experts standing by to answer any of your questions.
