Dealing with downtime: Hardware Reliability Needs to be Measured

Don’t confuse server uptime with server availability; they are two different things. Your servers could be running fine yet be unavailable to users because a network component, such as a router, firewall or WAN equipment, has failed, and this counts against server availability. Selecting servers with dual power supplies and multiple network cards increases their reliability; however, to achieve a truly highly available (H/A) network, install two or more load balancers configured in high availability mode.
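To make the difference between uptime and availability concrete, here is a minimal sketch of how the availability of a chain of components combines. It assumes failures are independent, which real networks only approximate, and every figure and function name in it is hypothetical.

```python
from math import prod


def serial_availability(components):
    """Availability of a chain in which every component must be up."""
    return prod(components)


def redundant_availability(availability, copies=2):
    """Availability of identical redundant copies where any one suffices.

    Assumes the copies fail independently of each other.
    """
    return 1 - (1 - availability) ** copies


# Hypothetical availability figures, for illustration only.
server = 0.9999
router = 0.999
firewall = 0.999
load_balancer = 0.999

# Users only reach the server if every component in the path is up.
single_path = serial_availability([server, router, firewall, load_balancer])

# Same path, but with two load balancers in high availability mode.
ha_path = serial_availability(
    [server, router, firewall, redundant_availability(load_balancer, 2)]
)

print(f"Single load balancer: {single_path:.5f}")
print(f"Redundant load balancers: {ha_path:.5f}")
```

The chain is always less available than its best component, which is why a healthy server can still miss its availability target.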

Defining the downtime rules

If you ask an IT manager what level of downtime the organization permits, the reply needs to be more than just a percentage such as 99%. Expressed as actual downtime per year, the common targets work out approximately as follows:

  • 99% = 87 hours 36 minutes
  • 99.9% = 8 hours 46 minutes
  • 99.99% = 52 minutes 34 seconds
  • 99.999% = 5 minutes 15 seconds
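The conversion from a percentage to an annual downtime allowance is simple arithmetic, sketched below. The sketch assumes a 365-day year; published tables can differ by a minute or a few seconds depending on the year length and rounding they use. The function names are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def annual_downtime_seconds(availability_percent):
    """Permitted downtime per year, in seconds, for a given availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR * 3600


def format_duration(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    total = round(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours} h {minutes} min {secs} s"


for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {format_duration(annual_downtime_seconds(target))}")
```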

The cost of reducing your permitted downtime varies server by server, and the calculation is more complex because different server functions have different levels of criticality. A print server going offline is more likely to be annoying than critical; it is a different matter if your mission-critical database server fails, as the damage to the business is immediate. Bear these different levels of criticality in mind as you estimate the cost of raising the reliability of your systems. If it will cost you $95,000 to raise the reliability of a server from 99.99% to 99.999%, but your business would only lose $1,000 a minute to downtime, the investment does not make a good return: the upgrade removes roughly 47 minutes of permitted downtime a year, worth about $47,000.
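The return on a reliability upgrade can be estimated with a few lines of arithmetic. The sketch below uses the figures from the example above and assumes a 365-day year, a one-off upgrade cost, and that the server actually uses its full downtime allowance every year. The function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def downtime_minutes(availability_percent):
    """Permitted downtime per year, in minutes."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def annual_saving(current_percent, target_percent, loss_per_minute):
    """Money saved per year by moving to a higher availability target."""
    saved_minutes = downtime_minutes(current_percent) - downtime_minutes(target_percent)
    return saved_minutes * loss_per_minute


upgrade_cost = 95_000  # one-off cost taken from the example in the text
saving = annual_saving(99.99, 99.999, loss_per_minute=1_000)
payback_years = upgrade_cost / saving

print(f"Annual saving: ${saving:,.0f}")
print(f"Payback period: {payback_years:.1f} years")
```

Under these assumptions the upgrade takes about two years to pay for itself; a server with a higher cost of downtime per minute would change the answer quickly.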

Perhaps the most intelligent way to measure server performance is not whether it can handle 80, 100 or 200 simultaneous sessions but the effective time it takes users to complete their transactions. If you run an ecommerce site and too few users can complete their transactions at peak traffic periods, the figure to care about and resolve is not the number of users who can connect but the number who are unable to complete their purchases. Your servers can still be running while you lose revenue as disappointed potential customers abandon your site.
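As a rough illustration of measuring completed transactions rather than connected sessions, the sketch below computes a completion rate and a median completion time from a sample of transactions. The `Transaction` record and the sample data are hypothetical; in practice the data would come from your own application logs or analytics.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Transaction:
    duration_seconds: float  # time from start of checkout to completion or abandonment
    completed: bool          # True if the purchase went through


def completion_rate(transactions):
    """Fraction of transactions that completed successfully."""
    if not transactions:
        return 0.0
    return sum(t.completed for t in transactions) / len(transactions)


def median_completion_time(transactions):
    """Median duration of completed transactions, or None if there were none."""
    durations = [t.duration_seconds for t in transactions if t.completed]
    return median(durations) if durations else None


# Hypothetical sample taken during a peak traffic period.
peak_sample = [
    Transaction(12.0, True),
    Transaction(48.5, False),
    Transaction(15.2, True),
    Transaction(60.0, False),
]

print(f"Completion rate: {completion_rate(peak_sample):.0%}")
print(f"Median completion time: {median_completion_time(peak_sample)} s")
```

Tracking these two numbers during peak periods shows lost revenue that an uptime figure alone would hide.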

 
