by Roger Kay

IBM, Lenovo and Cisco take top marks in server reliability

Opinion
Jun 13, 2017
Data Center, Servers

Oracle continues to disappoint managers dealing with aging hardware and looking for greater reliability, support and uptime.

The same handful of companies won the sweeps again this year in the Information Technology Intelligence Consulting (ITIC) global survey of datacenter server reliability. IBM, Lenovo, and Cisco took top marks, while Oracle fared particularly poorly, based on an aging hardware base and defections to other platforms. Dell, Hewlett-Packard (HP), Toshiba, Fujitsu and Stratus had mixed results.

The study, executed a year ago and updated in May 2017, was undertaken without vendor sponsorship. The survey polled IT managers from 750 global businesses in 45 vertical markets and all major geographies. Respondents answered a set of multiple-choice questions and filled out an essay. Results were validated via two dozen personal interviews.

In addition to measuring vendor performance on reliability measures, the study noted that the cost of downtime has increased, with, on average, an hour of system outage docking a company more than $1 million. Respondents said the main technology threats to reliability were security, employee-owned devices and mobility in general. Other key factors affecting reliability included human error, complexity and increased workloads on aging hardware.

Highlights from the study (lightly edited): 

  • Among the various offerings, IBM’s z Systems Enterprise garnered “best in class” for reliability, accessibility, performance, and security among all server platforms. 
  • IBM and Lenovo hardware and various Linux distributions were either first or second in every reliability category, including virtualization and security. 
  • Lenovo x86 servers achieved the highest reliability ratings among all competing x86 platforms. 
  • Users rated Lenovo tech support best, followed by Cisco and IBM. 
  • Around 66% of respondents said hardware 3 ½ years old or more had a negative impact on server uptime and reliability, an increase from the 2014 survey. 
  • Reliability continued to decline for the fifth year in a row on HP ProLiant servers and Oracle’s SPARC and x86 hardware and Solaris OS. Reliability on the Oracle platforms declined slightly, mainly due to aging hardware. Many Oracle hardware customers are eschewing upgrades, opting instead to migrate to rival platforms. 
  • While 16% of Oracle customers rated service and support poor or unsatisfactory, only 1% of Cisco, 1% of Dell, 1% of IBM, 1% of Lenovo, 3% of HP, 3% of Fujitsu, and 4% of Toshiba users gave those vendors poor or unsatisfactory support ratings. Dissatisfaction with Oracle licensing and pricing policies has remained consistently high for the past three years. 

Asked about the impact of increased workloads on reliability, availability, and uptime, only one-third of respondents said the increase “has had no impact” on reliability. Put the other way, two-thirds have been affected. Essentially, the reliability requirement has gone up: the dollars lost to an outage rise as the workload increases, which is the expected outcome.

A quarter of respondents said the increasing size and complexity of the data center workload has had a small but noticeable effect on reliability. 

In the unplanned downtime category, IBM ran away with it, as the z Systems Enterprise topped all other hardware by a mile in the mid-length-outage segment, with only 1% of respondents reporting downtime of up to four hours. z Systems Enterprise did even better in the long-outage segment, with no respondents reporting any outage longer than four hours. By the same measure, Oracle did the worst, with 13% of its x86 platforms experiencing long outages in the past year.

In terms of length of downtime calculated in minutes per server per year, IBM again stole the crown at 0.96 minutes. That’s less than one minute per server per year. As an operating environment, Linux seemed to fare better than others, which could generate an interesting commentary on open source development. HP was close to par with its Linux-on-Superdome combination, but fell to the bottom of the pack with its ProLiant x86 machines. Both top execs of the now-two-headed HP just made numbers eight and ten on the most-hated-CEOs list in a Forbes staff-written article in late May, based on survey data. It could be coincidental. Or maybe not.
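
For a sense of scale, the per-server downtime figure can be turned into an availability percentage. The sketch below is only a back-of-the-envelope conversion, not the survey's method: it assumes a 365-day year and treats every minute of the year as scheduled uptime, and the function name is made up for illustration.

```python
# Assumption: a 365-day year, with all minutes counted as scheduled uptime.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def availability_percent(downtime_minutes_per_year: float) -> float:
    """Convert annual downtime in minutes to an availability percentage."""
    return 100.0 * (1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR)

# 0.96 minutes per server per year is the IBM figure quoted in the article.
ibm_availability = availability_percent(0.96)
print(f"{ibm_availability:.4f}%")
```

Under those assumptions, 0.96 minutes a year works out to roughly 99.9998% availability.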

In the planned downtime bakeoff, no surprise, IBM won again, with z/OS hitting the top in fewest hours per month of planned downtime by a long mile, followed by a pack of equals: IBM’s AIX offering, its RHEL/SuSE package, and Lenovo’s System x. Although Windows did badly, some versions of Unix, Linux, and Mac OS did worse. With 1.5 hours per month of planned downtime, Debian, the free server operating system, held down the bottom of the pack, which may or may not say something about the despoiling of the commons.

Other findings in the study include: 

  • an increase in reliability requirements, to the point where 79% of respondents now need 99.99% reliability, 
  • rising downtime costs, which now average more than $1 million per hour, and 
  • the highest reliability requirements and the greatest average hourly downtime cost ($9.3 million), both found in the banking and finance sector. 
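
To put the 99.99% requirement next to the hourly cost figures, a rough calculation helps. The sketch below reads "reliability" as availability (the fraction of the year a system is up), which is an assumption rather than the survey's stated definition, and it assumes a 365-day year. The function names are illustrative. The $1 million rate is a floor, since the article says "more than"; $9.3 million is the banking and finance average.

```python
# Assumption: "reliability" is read as availability over a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted at a given availability."""
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)

def outage_cost(minutes: float, cost_per_hour: float) -> float:
    """Cost of an outage of the given length at a flat hourly rate."""
    return minutes / 60.0 * cost_per_hour

budget = downtime_budget_minutes(99.99)            # about 52.56 minutes
cost_average = outage_cost(budget, 1_000_000)      # at $1 million per hour
cost_banking = outage_cost(budget, 9_300_000)      # at $9.3 million per hour

print(f"Downtime budget at 99.99%: {budget:.2f} minutes per year")
print(f"Cost at $1.0M/hour: ${cost_average:,.0f}")
print(f"Cost at $9.3M/hour: ${cost_banking:,.0f}")
```

On these assumptions, even a system that just meets 99.99% could be down for about 52 minutes a year, which at the quoted rates would cost somewhere between roughly $0.9 million and $8.1 million.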

None of these results are particularly surprising. As more computing shifts to the cloud, data center workloads are increasing in size, complexity, and mission criticality. This shift correlates perfectly with the rising cost of downtime and the need for greater reliability. It’s clear from the data that IBM’s z Systems Enterprise, which is designed for such workloads, beats all comers in data center workload computing.