System availability targets

Change Tech Solutions, Inc

Published: 06 Feb 2003 13:54 GMT

To accurately measure system availability as experienced by end users, you must first thoroughly understand the system's configuration. This includes all the components and resources used by the application, both local and remote, as well as the hardware and software components required to access those resources. The next step is to monitor all these components for outages, then calculate end-to-end availability. Here's how to do these calculations.

Quantifying availability targets
To quantify the amount of availability achieved, you have to perform some calculations:

Committed hours of availability (A): This is usually measured in hours per month, or over any other period suitable to your organisation.
Example: 24 hours a day, 7 days a week = 24 hours per day x 7 days x 4.33 weeks per month (average) = about 727 hours, rounded in this article to 720 hours per month (24 hours x 30 days)

Outage hours (B): This is the number of hours of outage during the committed hours of availability. If a high availability level is desired, consider only the unplanned outages. For continuous operations, consider only the scheduled outages. For continuous availability, you should consider all outages.
Example: 9 hours of outage due to a hard disk crash, 15 hours of outage for preventive maintenance

Next you can calculate the amount of availability achieved as follows:

Achieved availability = ((A-B)/A)*100 percent

For the statistics in the examples above, here's each calculation:

  • High availability = ((720-9)/720)*100 percent = 98.75 percent availability
  • Continuous operations = ((720-15)/720)*100 percent = 97.92 percent availability
  • Continuous availability = ((720-24)/720)*100 percent = 96.67 percent availability
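
The three calculations can also be reproduced in a few lines of code. What follows is only a minimal sketch in Python: the function name `achieved_availability` and the constant names are illustrative assumptions, not part of any monitoring tool, and the figures are the ones from the examples above.

```python
def achieved_availability(committed_hours, outage_hours):
    """Return achieved availability as a percentage: ((A - B) / A) * 100."""
    if committed_hours <= 0:
        raise ValueError("committed_hours must be positive")
    if not 0 <= outage_hours <= committed_hours:
        raise ValueError("outage_hours must be between 0 and committed_hours")
    return (committed_hours - outage_hours) / committed_hours * 100


# Figures from the article's examples (names are illustrative).
COMMITTED = 720   # A: committed hours in the month
UNPLANNED = 9     # hard disk crash
SCHEDULED = 15    # preventive maintenance

# Which outages count as B depends on the availability level being measured.
results = {
    "High availability": achieved_availability(COMMITTED, UNPLANNED),
    "Continuous operations": achieved_availability(COMMITTED, SCHEDULED),
    "Continuous availability": achieved_availability(COMMITTED, UNPLANNED + SCHEDULED),
}

for level, pct in results.items():
    print(f"{level}: {pct:.2f} percent")
```

The only thing that changes between the three levels is which outage hours are passed in as B, which is why the same outage log can yield three different availability figures.
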
When negotiating an availability target with users, make them aware of the target's implications. Table A shows availability targets versus hours of outage allowed for a continuous availability level requirement.

Table A

Continuous availability target    Hours of outage allowed per month
99.99%                            0.07 hours
99.9%                             0.7 hours
99.5%                             3.6 hours
99.0%                             7.2 hours
98.6%                             10.0 hours
98.0%                             14.4 hours
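
The figures in Table A follow from rearranging the availability formula: allowed outage hours = A x (100 - target) / 100. The sketch below assumes the 720-hour month used earlier; the function name `allowed_outage_hours` is hypothetical and chosen only for illustration.

```python
def allowed_outage_hours(target_percent, committed_hours=720):
    """Return the outage hours a target permits within the committed period."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    return committed_hours * (100 - target_percent) / 100


# Targets listed in Table A.
targets = [99.99, 99.9, 99.5, 99.0, 98.6, 98.0]

for target in targets:
    hours = allowed_outage_hours(target)
    # Minutes are shown as well, since the tightest targets allow under an hour.
    print(f"{target}%: {hours:.2f} hours ({hours * 60:.0f} minutes)")
```

Expressing the allowance in minutes can be useful when negotiating with users: a 99.99 percent target on a 720-hour month leaves only a few minutes of outage. The 98.6 percent row works out to roughly 10.08 hours, which the table rounds to 10.0.
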

It is important to recognise that numbers like these can be difficult to achieve, since time is needed to recover from outages. The length of recovery time correlates with the following factors:

Complexity of the system: The more complicated the system, the longer it takes to restart it. Hence, outages that require system shutdown and restart can dramatically affect your ability to meet a challenging availability target. For example, applications running on a large server can take up to an hour just to restart when the system has been shut down normally, and longer still if the system was terminated abnormally and data files must be recovered.

Severity of the problem: Usually, the more severe the problem, the more time is needed to resolve it fully, including restoring any lost data or work.

Availability of support personnel: Let's say that the outage occurs after office hours. A support person who is called in after hours could easily take an hour or two simply to arrive to diagnose the problem. You must allow for this possibility.

Other factors: Many other factors can prevent the immediate resolution of an outage. Sometimes an application may have an extended outage simply because the system can't be taken offline while other applications are running. Other cases may involve a system supplier's lack of replacement hardware, or even a lack of support staff. I have seen many availability targets missed simply because a system supplier could not give due attention to the problem and no backup system supplier existed.
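
To see how quickly recovery time eats into a target, consider a single hypothetical incident built from the figures mentioned above: a support person who takes two hours to arrive after hours, followed by a one-hour restart. The sketch below assumes a 720-hour month and, optimistically, that diagnosis and repair take no time at all, so it understates the real outage. All names are illustrative.

```python
COMMITTED_HOURS = 720  # assumed 24x7 month, as in the earlier examples

# Continuous availability targets taken from Table A.
TABLE_A_TARGETS = [99.99, 99.9, 99.5, 99.0, 98.6, 98.0]


def availability_after(outage_hours, committed_hours=COMMITTED_HOURS):
    """Achieved availability (percent) after the given outage hours."""
    return (committed_hours - outage_hours) / committed_hours * 100


# One hypothetical after-hours incident (assumed values).
travel_hours = 2.0    # support person called in after office hours
restart_hours = 1.0   # normal restart of applications on a large server
incident_hours = travel_hours + restart_hours

pct = availability_after(incident_hours)

# Targets that survive this single incident for the month.
still_met = [t for t in TABLE_A_TARGETS if pct >= t]

print(f"Outage: {incident_hours} hours, availability: {pct:.2f} percent")
print("Targets still met:", still_met)
```

Under these assumptions, one incident is enough to put the 99.9 and 99.99 percent targets out of reach for the month, before any time spent on diagnosis or data recovery is counted.
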

Be aware that you won't get precise measurements of every user's availability experience. That's not realistic. Just recognise that users do have availability requirements to which you must pay attention. Don't become too dependent on technical measurements for rating your performance. In the end, what matters most is that users are happy with the service that the IT organisation provides.

The Harris Kern Enterprise Computing Institute is a consortium of publications -- books, reference guides, tools, articles -- developed through a unique conglomerate of leading industry experts responsible for the design and implementation of "world-class" IT organisations.

