So the cloud hype train rolls on, and we’re all constantly being told how the cloud can help cut costs, increase agility and reduce time to market. The cloud certainly has its advantages, and for SMBs and start-ups with little or no ‘IT baggage’ it is an attractive proposition. However, for most enterprises a transition to cloud computing is not something that should be undertaken lightly. Today, a large number of cloud solutions exist in the marketplace, providing great choice, but this leads to a complex decision-making process. Broadly, three core deployment models exist: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). These models are typically provided from an internal (private) cloud, an external (public) cloud or both. For more on cloud definitions and deployment models, I recommend reading this article by the National Institute of Standards and Technology (NIST).
Whichever cloud deployment model route an organisation decides to take, it needs to decide whether it wants to use a private or a public cloud, or even a combination of both. For this decision, there are lots of factors to take into account. Focussing predominantly on IaaS, below I highlight some of the factors that are likely to prohibit enterprises from taking advantage of the two main cloud types and a potential solution to those challenges.
The Public Cloud
In the IaaS space, the public cloud lends itself heavily to the SMB and start-up market because of smaller user numbers, fewer SLAs, off-the-shelf applications and only basic security compliance needs. However, for medium to large enterprises, apart from SaaS, the public ‘multi-tenant’ cloud is seen as too high a risk for the majority of their systems for the following reasons:
· Security and regulatory compliance
· Lack of enterprise-grade features such as DR and backup
· Lack of performance-based SLAs
· Complex transitions and migration paths
· Lack of standards (portability)
· Data versus server locations
· Reliability.
Although the public cloud is generally unsuitable for enterprise production systems, there is no doubt it can be an appealing proposition for test and development environments. This is where provisioning can be achieved in seconds, security compliance is often less of an issue and the ability to scale down as well as up is commercially very attractive.
The Private Cloud
So if enterprises aren’t moving their production systems into the public cloud, how can they take advantage of the commercial and operational benefits that cloud computing promises to deliver? There has been a lot of talk over the last 12 months about the ‘private cloud’, where enterprises essentially look to introduce cloud methodologies into their own IT organisation. Unfortunately, more often than not, virtualisation is confused with cloud computing, when virtualisation should really be seen as only one of the enablers for cloud.
At the recent International Cloud Computing Conference And Expo in Santa Clara, US, a number of large enterprises including the CIA presented on their approach to creating a private cloud and the challenges they faced along the way. The same underlying message came from all speakers – developing a private cloud takes time, significant investment and requires high levels of automation in order to achieve the required ROI. Some of the other challenges that can be expected are listed below:
· Initial CAPEX and ongoing infrastructure refresh
· Buy in at all levels
· Extensive planning
· High levels of automation
· Significant operational investment
· Complex tool and platform selection
· Limited in-house skills and time.
In the case of the CIA, its IT infrastructure is large enough to provide the economies of scale associated with cloud computing, whilst its strict security requirements meant its only option was to develop a private cloud.
The Virtual Private Cloud... or why not Federate?
It would seem that for most enterprises the IaaS public cloud is still too immature, and the initial capital/operational expenditure and time required to develop a true private cloud potentially outweigh the required ROI. It’s not all bad news, however: many hosting providers and telcos are bringing enterprise-grade Virtual Private Cloud offerings to market. These offerings place the burden of capex and opex onto the service provider but provide the end user with utility computing aligned to the needs of the enterprise.
In reality there is no one solution that fits all. Over the next few years more and more organisations will adopt a federated model, taking advantage of SaaS out of the public cloud and IaaS from virtual private clouds.
-Tom Brand, GlassHouse Technologies (UK) Virtualisation Practice Lead
Applications and the information they hold are increasingly the lifeblood of many organisations. In my experience, many businesses which have encountered major loss of data are never able to reopen; some attempt to but do not succeed and only a handful survive. This just goes to show the long-term importance of a business aligned DR strategy today.
For an enterprise-wide DR programme to be truly effective the IT requirements need to reflect business needs and the value of the assets that are being protected. Once an organisation can capture what assets they have, the external and internal risks they need to be protected from, and how to maintain the accuracy of those assets, an analysis of the impact of loss needs to be performed.
The understanding of ‘actual business needs’ and impact of loss can be determined through a Business Impact Analysis (BIA). This identifies how much downtime you can actually afford and, therefore, what level of protection you can ‘get away with’. The amount of downtime an organisation can ‘really afford’ is directly related to the financial, legal or public relations impact an application has on the organisation in the event of its unavailability. During the BIA a pain point is established at which the impact of the application’s unavailability spikes significantly. This point provides the key time figure: the amount of downtime your organisation can afford.
The well-known metrics that express this tolerance are the Recovery Point Objective (RPO) and Recovery Time Objective (RTO), both of which have a direct impact on the amount of investment it takes to protect the application. The more aggressive these metrics are, the more expensive the infrastructure required to meet them. Getting this impact versus cost balance right is one of the biggest challenges facing organisations when implementing and maintaining DR capability.
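The pain point that a BIA looks for can be illustrated with a minimal sketch. The impact figures, the `find_pain_point` helper and the `spike_factor` threshold below are all hypothetical, chosen only to show the idea. A real BIA would use figures agreed with the business and would weigh legal and reputational impact as well as financial loss.

```python
# Hypothetical BIA figures: estimated cumulative business impact (in GBP)
# after a given number of hours of downtime. All values are illustrative.
impact_by_hours_down = {
    1: 1_000,
    2: 2_000,
    4: 4_500,
    8: 9_000,
    24: 120_000,
    48: 260_000,
}


def find_pain_point(impact_by_hours, spike_factor=3.0):
    """Return the last downtime duration (hours) before the impact spikes.

    Assumption: a "spike" means the impact accrued per hour in one interval
    is at least spike_factor times the rate of the previous interval.
    Returns None if no such spike exists in the data.
    """
    points = sorted(impact_by_hours.items())
    previous_rate = None
    for (h0, i0), (h1, i1) in zip(points, points[1:]):
        rate = (i1 - i0) / (h1 - h0)  # impact per hour in this interval
        if previous_rate is not None and previous_rate > 0:
            if rate >= spike_factor * previous_rate:
                return h0
        previous_rate = rate
    return None


pain_point_hours = find_pain_point(impact_by_hours_down)
print(pain_point_hours)  # 8
```

With these made-up figures the impact accrues steadily up to eight hours and then accelerates sharply, so eight hours would be the downtime the application could tolerate and a natural starting point for its RTO.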
Another challenge for businesses creating a DR strategy is that they have to decipher the tangible and intangible impacting factors and deal with emotive and unrealistic views on the impact of applications. For example, human nature dictates that ‘my’ application is the most important and so requires the highest level of protection. The consequence of that is huge expenditure required to ensure minimal downtime for applications that do not have the business impact to warrant such investment.
Until recently the complexity of determining the impacting factors has played a significant role in DR being reserved for only the most critical applications. Now, technological enhancements mean that delivering DR and meeting recovery objectives has become cheaper and more commonplace, but understanding the impact of application unavailability remains, at best, confusing.
A DR strategy is paramount for any organisation that operationally or legally relies on its applications and the information they hold. Without one, the interruption of applications could spell disaster for you personally and/or your organisation. Getting the balance right between downtime and the level of protection you could get away with will result in an effective DR strategy that matches the value of the application to the cost of the infrastructure to protect it.
-Simon Johnson, GlassHouse Technologies (UK) Disaster Recovery Practice Lead
Disaster recovery (DR) is, by its very nature, difficult to plan for. But we’re all well aware of the problems associated with insufficient DR processes, policies and procedures. If a business’s IT infrastructure cannot recover from a ‘disaster’ quickly, the implications can be extremely costly.
Recovery Time Objective (RTO) and Recovery Point Objective (RPO) are key measurements that an IT manager needs to make the business aware of, so that downtime can be provisioned for accordingly. Both are recovery metrics expressed in units of time, and they provide quantifiable figures for understanding the business’s tolerance for application downtime and data loss.
RTO is the maximum amount of time allowed to recover from a disruption and for the business to be operational again. The more aggressive your RTO, the shorter the time available to restore the system to normal functioning. This inevitably means more financial investment is required in high-availability infrastructure, but that is perhaps a small price to pay in the long run if something does go wrong. There are many technology options to consider, including various forms of clustering, completely redundant infrastructure, and data replication on or off site.
The RPO looks at the maximum amount of data loss acceptable in the event of a disruption. A business will ask itself “how much can we afford to lose”. For example, if there is a nightly backup at 21:00 and the system fails at 07:30 the following day, the system will have lost all data modifications since the backup at 21:00 the previous night. The question is – is that loss acceptable to the business?
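The nightly backup example above can be worked through in a few lines. This is only a sketch: the calendar dates are arbitrary, and the `data_loss_window` and `meets_rpo` helpers and the candidate RPO values are illustrative assumptions rather than part of any particular backup product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Return how much recent work is lost if we restore from last_backup."""
    if failure < last_backup:
        raise ValueError("failure time must not precede the last backup")
    return failure - last_backup


def meets_rpo(last_backup: datetime, failure: datetime, rpo: timedelta) -> bool:
    """True if the data lost in this scenario is within the agreed RPO."""
    return data_loss_window(last_backup, failure) <= rpo


# Backup at 21:00, failure at 07:30 the following day (dates are arbitrary).
last_backup = datetime(2024, 1, 1, 21, 0)
failure = datetime(2024, 1, 2, 7, 30)

window = data_loss_window(last_backup, failure)
print(window)  # 10:30:00

# Two hypothetical RPOs: a 4-hour RPO is missed, a 24-hour RPO is met.
print(meets_rpo(last_backup, failure, timedelta(hours=4)))   # False
print(meets_rpo(last_backup, failure, timedelta(hours=24)))  # True
```

In this scenario ten and a half hours of changes are lost, so a nightly backup alone would only satisfy a business that can tolerate an RPO of roughly a day.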
Like RTO, the more aggressive the RPO, the greater the financial investment in infrastructure required to meet it.
Some businesses, or areas within a business, may not be able to tolerate RTOs and RPOs of any longer than a few hours, while others may be able to survive downtime for periods of, say, a week with minimal impact. These requirements can normally be determined from the Service Level Agreements (SLAs).
For years businesses and their IT departments have struggled to understand and communicate effectively with each other, resulting in either significant under or over investment in both operational and disaster recovery application protection. Accurate RPO and RTO metrics have helped bridge this gap and, combined with business impact analysis, facilitate the alignment of applications to correct data protection levels and generate the accurate levels of investment to protect data.
SLAs are unachievable unless a business has the capabilities to deliver them. Organisations need to understand how and where data protection is delivered in order to optimise operations and meet the SLAs. Although they typically play significant roles, backups, snapshots and mirrors do not solely deliver RTOs and RPOs. Many levels of resilience throughout the IT supply chain combine to deliver recovery capabilities. These must all be accurately measured to generate the RPOs and RTOs. These quantifiable objectives translate requirements into tangible metrics which facilitate the selection of infrastructure to enable effective achievement of the SLAs, even in an unforeseen disaster situation.
-Simon Johnson, GlassHouse Technologies (UK) Disaster Recovery Practice Lead
If you were to believe the statistics presented by some vendors, you’d be forgiven for thinking the vast majority of organisations are already running extensive levels of virtualisation across their production environments. In reality, however, many organisations have only virtualised the low-hanging fruit or are still in the process of piloting their virtual infrastructures. One of the first barriers to virtualisation roll-out is often a lack of understanding of the differences between a Proof of Concept (POC) and a Pilot.
A POC is typically a partial and often standalone solution used to establish that a concept or system satisfies some aspect of the requirements for the complete solution. The proof of concept implementation will not affect business operational data although it may integrate with existing business systems to some extent. In many environments pilots are actually more like POCs, but unfortunately the pressure to reduce cost and rapidly deliver new services has forced the POC infrastructure to become integrated with production bypassing the wider scope of planning that should be undertaken.
The purpose of a pilot project is to test, usually in a production environment, whether the system is working as it was designed while limiting business exposure. The transition from running a pilot to virtualising the wider environment shouldn’t be a leap of faith, because sufficient design, development and planning should have been undertaken, and here lies another barrier. The design and planning required for the pilot should in effect be treated exactly the same as deploying the production environment. When a successful pilot has been completed, more often than not, it will simply be rebadged as production and expanded accordingly.
At a high level, technology and operations are both key aspects that need to be planned and tested carefully in order to ensure the transition from pilot to production is a strategic success. Virtualisation pilots often tend to be very technology orientated when, in fact, there should be just as much focus on the operational elements associated with successfully managing the virtual infrastructure. These operational processes, such as change management, capacity planning, virtual machine (VM) lifecycle management and chargeback, have to be in place during the pilot and have the ability to scale into production. From a technology point of view, organisations must look beyond the hypervisor and address all the components of the infrastructure that virtualisation has an impact on, such as networks, backup, storage and disaster recovery. Many pilots simply test the smaller, easy-to-virtualise candidates and focus only on performance at the application and operating system layer, which often produces unrealistic results.
Organisations must have test strategies that include the full range of potential configurations. This will ensure the infrastructure has the capacity to scale in order to meet the demands of larger workloads as and when they are virtualised. The classic example is storage input/output (I/O): cheaper storage technologies are implemented and VMs perform as expected during the pilot, but performance can decline significantly once the infrastructure is loaded or VMs with heavier workloads are introduced.
With an operational strategy and a ‘bigger picture’ approach to virtualisation technology planning, organisations won’t need to make a leap of faith; they can just cross the bridge to a better place.
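The storage I/O trap can be sketched as a simple capacity check. Everything here is assumed for illustration: the VM names, their peak IOPS figures, the array capacity and the `storage_headroom` helper. Summing peak demand is also a deliberate simplification, since real workloads rarely peak at the same moment and latency matters as much as raw IOPS.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    peak_iops: int  # hypothetical peak storage I/O demand of the workload


def storage_headroom(vms, array_iops_capacity):
    """Return spare IOPS on the array; a negative value means oversubscribed."""
    return array_iops_capacity - sum(vm.peak_iops for vm in vms)


# Assumed capacity of a lower-cost storage array chosen for the pilot.
ARRAY_IOPS_CAPACITY = 5_000

# Typical pilot candidates: small, easy-to-virtualise workloads.
pilot_vms = [VM("web-01", 300), VM("print-01", 150), VM("dns-01", 100)]

# Production adds the heavier workloads that the pilot never exercised.
production_vms = pilot_vms + [VM("sql-01", 3_500), VM("mail-01", 2_000)]

pilot_headroom = storage_headroom(pilot_vms, ARRAY_IOPS_CAPACITY)
production_headroom = storage_headroom(production_vms, ARRAY_IOPS_CAPACITY)

print(pilot_headroom)       # 4450
print(production_headroom)  # -1050
```

The pilot appears to have ample headroom, yet the same array is oversubscribed once the database and mail servers arrive, which is why a test strategy needs to include the heavier configurations from the start.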
-Tom Brand, GlassHouse Technologies (UK) Practice Lead
With tightened budgets, businesses are constantly looking for ways to see a rapid return on investment (ROI). This has increased interest in the adoption of pay-as-you-go cloud services and virtualisation technologies where the ROI can be very attractive. In recent months, Tom Brand, virtualisation practice lead at GlassHouse Technologies, has found himself frequently answering the question: “What is the difference between cloud computing and virtualisation?” In Tom’s latest blog post he gives his view…
In order to answer this question it is first important to clarify what the two terms, cloud computing and virtualisation, actually mean. According to the National Institute of Standards and Technology (NIST), cloud computing is a pay-per-use model for enabling available, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
Virtualisation is a technique used to abstract the physical characteristics of computing resources from the systems, applications or end users that interact with those resources. Virtualisation technologies typically let a single resource (such as a server, an operating system, an application, or storage device) appear as multiple logical resources, or make multiple physical resources (such as storage devices or servers) appear as a single logical resource.
In analysing the definitions above, trying to compare cloud computing and virtualisation is similar to comparing a car to an engine respectively. A car is a complex system including parts, interfaces, inputs and an engine that function as one to provide the best and most efficient drive for the owner. Like the car, cloud computing (public or private) is essentially the coming together of technologies, operational processes and financial models to provide organisational flexibility with optimum cost-efficiency. Continuing with the automotive analogy, the engine of a car is a core component because without the engine, the car won’t move, regardless of whether there are any seats in it. With cloud computing, virtualisation is the core component enabling the majority of characteristics required to make any cloud computing model work.
Going one stage further, you can compare the cloud to a cost-effective metered taxi service, always at your disposal. You now have a range of highly efficient vehicles that can be requested whenever you need to travel. They are operated and maintained by someone else and you only pay for the length of the journey (paying-as-you-go) with the ability to get out whenever you like. Although cloud computing encompasses a large range of compute services, typically labelled as either infrastructure (IaaS) or applications (SaaS and PaaS), they all fit within the taxi model. End users only pay for the services or resources they use, and the service and the ability to provision are always available. Users have the flexibility to request different services; however, the underlying infrastructure is not their concern and cannot be modified. In summary, virtualisation improves IT efficiency, enabling traditional computing with fewer resources, whereas cloud computing improves IT effectiveness, empowering more people to build services with more flexibility and fewer experts. When implemented correctly, both technologies can provide an attractive ROI for the IT department.