Virtualization and Cloud Advisor

Expert analysis and advice on server virtualization technologies, deployments and management.


Our blogger: Bernard Golden is CEO of consulting firm HyperStratus, which specializes in virtualization, cloud computing and related issues. He is also the author of "Virtualization for Dummies," the best-selling book on virtualization to date.

Thu, October 15, 2009

Microsoft Sidekick Debacle and the Cloud: Lessons Learned

By Bernard Golden

Keywords: cloud, cloud computing, Sidekick, Microsoft, Bernard Golden, HyperStratus

This week's cloud tempest is the very visible breakdown of Microsoft's Danger storage service for the T-Mobile Sidekick phone. An apologetic email (as reported by TechCrunch) first went out from Microsoft to users noting that all data had been lost with no way to recover it. Apparently, it now seems that some or most of the data will be recovered, which is, of course, good news. I don't know that Microsoft has provided any formal explanation of what went wrong, but most of the speculation I've seen identifies a failed SAN upgrade with no data backup available as the cause for the data loss.

People on all sides of the cloud debate have been arguing over this incident and treating it as a proxy for the entire concept of cloud computing.

While one should not conflate this situation with the totality of cloud computing, it highlights some very important issues that are worth exploring and understanding.

Lessons to be Drawn

It's a cloud: Some writing I've seen on this incident downplays it because, in the view of the authors, this service isn't really a cloud offering. They say it's a limited application, or an adjunct service to a hardware device, or it's really a consumer service and therefore not a "real" cloud application because those are aimed at business users. That's baloney.

First of all, it is a cloud application. It certainly fits into the common SaaS definitions. The "it's really a consumer service" rationale won't wash, either. With the blurring of consumer and commercial use, what's personal to one person might be mission-critical to another. And trying to deflect concern about this incident by defining it away misses the point. Cloud computing is a big tent (if I may mix a metaphor), and one of its strengths is the fact that many different approaches can be considered as cloud computing. In any case, clever dissembling is beside the point. If it walks like a duck and quacks like a duck, trying to convince someone that it's not a duck because it's actually a similar-looking, slightly different species is unlikely to succeed.

This attention bespeaks intense interest in the cloud: Let's face it, all the hullabaloo about this incident is good news, because it means people recognize cloud computing is an important development. You don't spend a lot of time worrying about something you don't care about. It's obvious that the concept of cloud computing has garnered attention, which I attribute to the fact that everyone recognizes that the old methods of running IT infrastructure are expensive and don't scale.

This incident represents a breach of best practices: Losing data is the greatest shortcoming an operations group can suffer. A service outage is bad, but losing data is inexcusable. In fact, calling this a breach of best practices gives it too much credit. The term "best practice" describes a set of processes performed by the leaders in a field, not the mainstream. Backing up data is data management 101; really, it's 01. If this incident is truly a result of failing to do a backup, it contravenes the simplest, most basic practice of managing data. No matter what the cause, losing data is inexcusable.

It calls into question one of the tenets of cloud computing: The expertise of cloud providers. My company does not run its own email service; we use Google to manage our mail system. Is this because we don't know how to run a mail server? Of course not. We do it for a very simple reason: using Google allows us to focus on our core mission, serving our clients.

We are very aware of what would happen if we ran our own mail server. Every time there was a problem, we'd treat it like an inconvenient interruption, and do just the minimum to patch the problem and get back to our real work. We would never devote the full amount of time that running a mail server deserves. Therefore, our mail service would always be fragile, subject to interruption, and (most likely) vulnerable to security penetration. So we turn to a company that can devote real resources to running our mail server, one that follows best practices, and one that can take the necessary time to do it right.

An article on CRN blamed the outage on the fact that Microsoft is working on another project and pulled engineers from Danger onto the other project. Frankly, this is, or should be, irrelevant from a user perspective. A cloud provider is running a service and has to be committed to operational excellence, despite any other distractions or competing priorities. Otherwise, it forces the customer to examine the internals of the cloud service. This, from the perspective of the customer, is impractical, since everyone has limited time to devote to these things—a problem which will only get worse, given the fact that we are moving to a world in which use of cloud services is rapidly multiplying.

Moreover, most cloud providers don't want a horde of customers insisting on auditing the service; the support required for customer audits is not scalable. Finally, a customer shouldn't have to examine the inner workings of the cloud service. One doesn't question how the local electric utility schedules its generator maintenance, so why should it be necessary for a cloud service? Customers should not have to do detailed evaluations of a cloud service: it's the job of the service provider to ensure appropriate operational processes are in place.

Whatever the reason for the data loss, it calls into question the tenet that cloud computing enables a better level of discipline and expertise to be devoted to a service offering. If a customer can't depend on a cloud provider to perform at a higher level than the customer could do on its own, why should it turn to the cloud?

Likely Outcomes of this Incident

Microsoft evaluates its practices throughout its cloud offerings: I guarantee that one outcome of this incident is that an edict came down from on high: "Make sure no other system is vulnerable to this problem!" There are undoubtedly a bunch of operations groups at Microsoft digging through backup practices to ensure redundant data is stored and that reliable backups are being performed. Also undoubted is the response of these groups: "how come we're being stuck with a ton of extra work because they screwed up?" Fellas, that's just the way organizations work.

Other cloud providers use this as a "teaching moment": While these cloud companies are wiping their hands across their foreheads in relief, thinking "there but for God's grace go I," senior management is regarding this incident as an inexpensive way to learn an important lesson, and is taking it as an opportunity to do a low-risk drill. Of course, if other Microsoft operations groups resent having to do work because of this incident, imagine how ops groups in other companies feel!

Microsoft's credibility suffers a short-term hit: Some people will generalize this situation to all of Microsoft's offerings, and be more cautious about using them. Let me be clear: I don't believe this situation represents Microsoft's typical operations practices. Hotmail is a far larger service, and I don't recall hearing anything like this happening with it. Nevertheless, Microsoft's overall cloud reputation will be tarnished for a while.

The best thing for Microsoft would be to treat this as a crisis management event, and follow the established playbook: early apologies, full transparency, frequent updates. That still won't prevent people from re-evaluating their opinions, at least in the short term, but it will help those opinions return to their long-term level more quickly.

Cloud computing in general suffers a short-term hit: Any time one market participant suffers a significant blow, the concern spreads to others. All cloud providers are going to be questioned about their competence regarding storage practices. It's inevitable and unavoidable. Rather than resisting it, they should take it as an opportunity to proclaim how seriously they take this topic and describe at length the extensive, redundant, and highly structured processes they have in place to avoid issues like this one. This information won't stop people from querying the provider, but it shows responsiveness and provides the opportunity to pick up share.

Long-term, this is a minor bump in the road: Of course this is a significant incident, and of course a very difficult situation for those affected by it, but in the long run, this will be looked back on as a minor incident. Cloud computing is gaining momentum, driven by an appreciation of its strengths and cost efficiencies, and a problem, even one as serious as this, will not long hinder its progress.


Follow Bernard Golden on Twitter @bernardgolden. Follow everything from CIO.com on Twitter @CIOonline
