RTO is more than data recovery


May 23, 2006

RTO represents the maximum acceptable time to recover, not the best-case scenario
Computerworld Opinion by Jim Damoulakis

http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=storage&articleId=9000694&taxonomyId=19

Last week's column (see: "The data recovery expectations gap") discussed the gap that often exists between end users and IT infrastructure, and suggested the need to distinguish between operational recovery and disaster recovery. Even with this distinction, there are some additional misconceptions regarding recovery time objective (RTO) that merit consideration.

In storage assessments, we often come upon RTO metrics that seem to be unrealistically short -- the achievability of RTOs of less than four hours, in particular, is highly suspect. Upon further probing, it turns out that we are dealing with differing perspectives of what constitutes successfully achieving an RTO. To the storage person, the RTO is achieved when the data has been restored. This, of course, is a somewhat parochial view and not one that others who actually depend on the data are likely to endorse.

A true RTO encompasses the entire time span from the point that an outage occurs to the point where users can resume operation of an application and its associated data. Overall application RTO consists of several subcomponents, including:

  • The time to detect an outage
  • The time to bring up an alternative environment
  • The time to bring up systems and communications
  • The time to recover data
  • The time to restore and start the application
  • The time to verify that the application and recovered data are functioning properly and that the data is valid
  • The time to restart user access

Of course, not every one of these steps is required in all recovery scenarios, but they all need to be considered when committing to an RTO. Essentially, an RTO represents the maximum acceptable time to recover, not the best-case scenario.
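To make the point concrete, here is a minimal sketch that adds up the subcomponents listed above and compares the total with a four-hour RTO. The step names follow the list; every duration is a hypothetical figure chosen for illustration, not a measurement from any assessment, and the names `RecoveryStep`, `total_minutes` and `meets_rto` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class RecoveryStep:
    name: str
    best_case_min: float   # hypothetical estimate, in minutes
    worst_case_min: float  # hypothetical estimate, in minutes


# Illustrative numbers only; none of these figures come from the column.
STEPS = [
    RecoveryStep("detect outage", 5, 30),
    RecoveryStep("bring up alternative environment", 15, 60),
    RecoveryStep("bring up systems and communications", 20, 45),
    RecoveryStep("recover data", 60, 180),
    RecoveryStep("restore and start application", 10, 30),
    RecoveryStep("verify application and data", 15, 45),
    RecoveryStep("restart user access", 5, 15),
]


def total_minutes(steps, worst_case=True):
    """Sum the step durations, using worst-case figures by default."""
    return sum(s.worst_case_min if worst_case else s.best_case_min for s in steps)


def meets_rto(steps, rto_min):
    """An RTO is a maximum, so it is checked against the worst-case total."""
    return total_minutes(steps, worst_case=True) <= rto_min


if __name__ == "__main__":
    rto = 240  # four hours, in minutes
    print("best case :", total_minutes(STEPS, worst_case=False), "min")
    print("worst case:", total_minutes(STEPS), "min")
    print("meets 4-hour RTO:", meets_rto(STEPS, rto))
```

With these made-up figures, data recovery alone fits comfortably inside four hours even in the worst case, and so does the best-case total, yet the worst-case total for the whole sequence does not. That is the gap between the storage view and the user view.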

I'm not suggesting that RTOs under four hours are not achievable, but they require a well-defined and understood process and investment in the right technology. Also, since data recovery is a critical, and potentially time-consuming, element of overall recovery, it may make sense to define an additional metric -- a data RTO -- to ensure that this key component is successfully completed.
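One possible way to arrive at a data RTO, offered here as an assumption rather than a prescribed method, is to reserve the worst-case time for every other recovery step and treat what remains of the overall RTO as the budget for data recovery. The function name `data_rto_budget` and all durations below are hypothetical.

```python
def data_rto_budget(overall_rto_min, other_steps_worst_min):
    """Return the minutes left for data recovery once the worst case
    of every non-data step has been reserved from the overall RTO."""
    budget = overall_rto_min - sum(other_steps_worst_min.values())
    if budget <= 0:
        raise ValueError("non-data steps alone exceed the overall RTO")
    return budget


# Hypothetical worst-case estimates, in minutes, for the non-data steps.
other_steps = {
    "detect outage": 15,
    "bring up alternative environment": 30,
    "bring up systems and communications": 30,
    "restore and start application": 20,
    "verify application and data": 30,
    "restart user access": 10,
}

if __name__ == "__main__":
    print("data RTO:", data_rto_budget(240, other_steps), "min")
```

Under these assumed figures, a four-hour application RTO leaves well under two hours for the data itself, which is the number the storage team would actually need to commit to.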

Jim Damoulakis is chief technology officer of GlassHouse Technologies Inc., a leading provider of independent storage services. He can be reached at jimd@glasshouse.com.
