Craig Mundie, 59, is Microsoft Corp.'s chief research and strategy officer. He assumed his position as chief visionary in June 2008 after Bill Gates retired from day-to-day operations at the company.
In his current role, Mundie is responsible for the long-term strategic direction of the business and recently completed a technology tour of U.S. colleges. Microsoft Research has 800 researchers in six locations worldwide. The company plans to spend more than $9 billion on R&D this year, up from $8.2 billion in 2008.
Microsoft announced some belt-tightening recently. How will that affect the R&D budget and your priorities? Will you scale back on basic research, as some competitors have? No. We have an opposite view on that, which is the tighter the economic times, the more the focus has to be on maintaining your R&D investment broadly.
You want the normal cycle to produce timely results, but we've always believed that the pure research component was critical to us for several reasons. It gives us the ability to continue to enhance the businesses we're in. It gives us the ability to disrupt certain industries that we choose to enter [and] it's a shock absorber that allows us to deal with the arrival of the unknown from competitive actions or other technology breakthroughs. In these uncertain times, all three of those things are important to us.
So, no substantive changes despite the economy and staff cutbacks? Yes. We did our first cross-company layoff in January -- about 1,400 people -- and we'll continue to make adjustments up to a total of 5,000 people in the course of the next 18 months. But a lot of this is not strictly cost restructuring but a resource reallocation mechanism in order to fund the things that we think we need to grow.
We believe that the economy is heading for a reset, not something from which there's going to be a nice, quick, happy rebound. We've taken our cost structure down to a level that we think is sustainable against that kind of economic outlook.
There are several different ways we seek to insert innovation into the products. First, we develop new features for the products we have. Second, we create new products and put them alongside the products we already have.
For example, we created OneNote because we thought there was going to be a new requirement for this type of more preformed "notebook/handwriting/aggregate everything" type of capability that didn't fit naturally within the mission of Word, PowerPoint and the other products that are part of the Office suite. So we put a new product in there. That didn't create any new compatibility problems but produced a whole new capability.
As we look in the platform and tools areas toward the future, we expect that there's a lot of change coming in the underlying architectures -- for example, like virtualization and the fact that there will be many CPU elements that actually allow some of the side-by-side execution of things. That can provide perfect compatibility while still allowing the introduction of things that represent whole new capabilities.
Have you ever been disappointed that a great technology didn't take off as well as you expected? One of the biggest areas I championed in my early years here in the early 1990s was interactive broadband television. We had many of the ideas, perfected a lot of the technology in the '90s, and we're sitting here in 2009 just starting to see a significant global ramp-up of that as a successor to traditional television.
It's certainly been a disappointment to me that the collective things necessary to make that happen -- for example, broadband network penetration and performance and things like that -- have lagged so far behind globally, and especially in the United States. Things that are very interesting technologically for the user are just not being brought to market at the speed I would have hoped for.
But users are consuming more television programming over the Web these days. Are we really that far away from your vision? If you're looking at the big-screen experience, is it actually being delivered over a packet-switched network with a basis of interactivity and two-way communication? That's what I call full-tilt-boogie IP TV, and it's still ultimately the solution that people will come to use.
Given that that largely hasn't become available yet, and that we have so many people growing up with a lot of comfort in the use of PCs as media playback devices, we're starting to see people seeking that kind of network entertainment experience, delivered on demand and over the network. The business model there tends to be more ad-supported than subscription-based. For some people, that does represent access to content that historically they had to pay a [cable TV] network subscription fee for, and they're now able to get back into the ad-supported model.
In a way, it's a bit like the new world surrogate for over-the-air television. All the major networks in the United States are free over the air all the time, yet very few Americans watch them because the experience isn't very good, the shift to digital has not taken place, and there is limited content availability.
None of those [problems] exist in Internet access to that media. There's no limit on shelf space, you can have as much differentiation as you want, there's no rigorous timetable that you have to watch in prime time. So many of the things people covet in their entertainment experience, they can get today in what people call "over the top" or over the IP network access to the media. I think of it as the contemporary surrogate for free-to-air television.
What is your proudest R&D achievement? I did a lot of the early work broadly in non-PC-based computing. All of the things that we have today that have matured into our game console business, our cell phone business and our Windows CE-based, Windows Embedded and Windows Mobile capabilities all started in the groups I formed here between 1992 and 1998.
I look at the progress we've made there and I take quite a bit of pride in the fact that we anticipated those things and we were able to get into so many of them. The company's strategy all along was to recognize that, ultimately, people would have many smart devices, and we wanted to be the company that would have some cohesive way of dealing with all of them. There's still work to be done, but nobody else has invested to have a position in so many of the devices that are now important to people.
You talk about technology waves. What will be the next big wave? What happens in waves is the shift from one generation of computing platform to the next. That platform gets established by a small number of killer apps. We've been through a number of these major platform shifts, from the mainframe to the minicomputer to the personal computer to adding the Internet as an adjunct platform. We're now trending to the next big platform, which I call "the client plus the cloud."
That's one thing, not two things. Today, we've got a broadening out of what people call the client. My 16 years here was in large measure about that. And then we introduced the network. The Internet was a place where you had Web content and Web publishing, but other than being delivered on some of those clients, the two things were somewhat divorced.
The next thing that will emerge is an architecture that allows the application developer to think of the cloud plus the client architecturally as a single thing. In a sense, it is like client/server computing in the enterprise. It was the homogeneity that existed between some of the facilities at the server and the client end that allowed people to build those applications. We've never had that kind of architectural homogeneity in this cloud-plus-client or Internet-plus-smart-devices world, and I'm predicting that will be the next big thing.
What the world is searching for now is the right combination of underlying technologies and some killer apps that will demonstrate that the capabilities of this integrated end-to-end view of the cloud-plus-client will enable things that the world hasn't seen yet. That's what we're focused on here.
So, what technologies will drive this? The technologies come at this at two levels. What are the underlying shifts in the lower-level platform technologies that will allow that to happen? And what are the things that might change the user's experience in some fundamental way?
There are two big things that form the nucleus of those two big changes. The microprocessor itself is going to change to this heterogeneous, many-core capability over the next four or five years. We've been planning for it, we know it is coming, it's sort of on the rails, and yet most of the world hasn't come to grips with the implications of that in terms of the application model and programming tools. To get performance, you're going to have to write parallel applications, and if it's cloud-plus-client, you're going to have to write distributed parallel applications. Those have historically been viewed as hard problems, but they will have to become de rigueur in the future.
The second thing is that the technologies of man-machine interaction are evolving and will be aided by the quantum change in computational capabilities, such that for the first time client devices will be able to implement natural, more humanistic ways of dealing with people. We call that next era the natural user interface.
Think of it as the successor to the graphical user interface. Microsoft was the company that drove the broad adoption of the GUI by putting Word and Excel on the early version of Windows. That became the killer app that brought us personal computing. Now we can see the outline of the NUI, just as we could see the outline of Windows coming. And yet you have to figure out, what are the killer apps?
And what will those killer apps be? We're working on some, [but] they're very hard to predict. You can't really gin up a killer app on demand. A certain serendipitous process has to take place for those things to emerge. There's an invention part of that, there's a technological part of that, there's a market readiness part of that, and none of those things are completely controllable. But that kind of [event] comes around every 15 years or so in our industry, and we're getting into that time zone. That's why we think it's going to happen.
During presentations, you've shown Laura, a 3-D avatar you call the robotic receptionist. How does this tie into this wave? We can take the new technologies of robotics, which are designed for high-scale, highly concurrent, distributed application development, and use them as a vehicle to compose together many of the individual advanced technologies like speech synthesis, speech recognition, human-feature-based modeling, machine vision and machine learning. Is there a way to compose these things together such that the whole really is greater than the sum of its parts?
What Laura showed was that we're at the bleeding edge of being able to bring these together in such a way that there is a qualitative change in the way you can interact with a computer system. It really does become more like dealing [with the computer] on a person-to-person basis in a free-form way. The computer will move from being strictly a reactive tool in your hand to being a proactive partner in trying to solve problems. It's that change in the qualitative experience, by which the computer helps you get stuff done or does stuff for you, that I think will be the hallmark of this next era.
Microsoft Research's robotic receptionist demonstrates how multiple technologies can be integrated to create a new way in which users can interact with computers.
How has Laura evolved since you took the demo on the road last year? We've been broadening out Laura, teaching her some other domains to learn more about how she interacts with people. We taught her recently how to play trivia [games] with people. All of these things are ways of finding out how people react to dealing with a lifelike avatar that really does interact with them just like a person.
What are the challenges Laura faces before she can work in the lobby? The demo consumes an eight-core machine pretty much fully when it's interacting [with people]. Yet each element of it -- the vision system, the speech system, the reasoning system -- is running at a fairly coarse granularity.
But if you give us more horsepower, it will just get better. This is a precursor to a new class of applications that have an almost unlimited appetite for computational capability. That's a very different situation than we find ourselves in with most applications today. They barely utilize the capability of the machines we have.
As we get more horsepower, Laura's performance will be better in every dimension. The quality of her speech will be better, and we'll be able to move beyond a rough polygonal model of her face and features and the animation of her face and movement.