Open source identity: PulseAudio creator Lennart Poettering

He likes photography and skiing, but the primary concern of Lennart Poettering is advancing the Linux audio experience with PulseAudio, an open source sound server.

PulseAudio’s impressive set of features includes per-application volume controls, a modular architecture, support for multiple audio sources and sinks, the ability to discover other computers using PulseAudio on the local network and play sound through them, and the ability to change which output device an application plays sound through -- while the application is playing sound.

"It's pretty obvious that the complaints and criticisms about PulseAudio you can hear in some forums are not really shared by the vast majority of technical people." -- Lennart Poettering, creator of PulseAudio

Thanks to PulseAudio, the Linux audio experience will become more context-aware. For example, if a video is running in one application the system should automatically reduce the volume of everything else and increase it when the video is finished.
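As a rough illustration of that kind of context-aware behaviour, the sketch below lowers the volume of every application except the one playing video and restores the normal levels once no video is running. The function name `duck_volumes`, the `duck_factor` value and the application names are hypothetical and are not part of PulseAudio's API.

```python
def duck_volumes(volumes, active_video, duck_factor=0.3):
    """Return per-application volumes with everything but the video app reduced.

    volumes maps an application name to its normal volume in [0.0, 1.0].
    active_video is the application playing video, or None if no video runs.
    duck_factor is an illustrative value, not a PulseAudio default.
    """
    if active_video is None:
        # No video is playing, so every application keeps its normal volume.
        return dict(volumes)
    return {
        app: vol if app == active_video else vol * duck_factor
        for app, vol in volumes.items()
    }


normal = {"video player": 1.0, "music player": 0.8, "chat": 0.5}
during = duck_volumes(normal, "video player")  # other apps are quieter
after = duck_volumes(normal, None)             # normal levels are restored
```

Computing the reduced levels from the stored normal levels, rather than modifying them in place, means the original volumes come back unchanged when the video ends.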

Previously, the Open Source Identity series has featured interviews with Ruby on Rails creator David Heinemeier Hansson, Linux’s Linus Torvalds, Jan Schneider of Horde, Mark Spencer of Asterisk fame, Spine CMS creator Hendrick van Belleghem, and Free Telephony Project founder David Rowe. This time we catch up with Lennart immediately after this year’s Linux Plumbers Conference to find out the latest PulseAudio (PA) developments.

What are some of the recent developments with PulseAudio? How are you responding to criticism over the role of PulseAudio?

I am not too concerned about most of the criticism and flames that erupt from time to time on various channels. All the big Linux distributions have adopted PulseAudio and it is an integral part of both the Palm Pre and the Nokia N900 devices, as well as Intel's Moblin.

That basically means that PulseAudio has been adopted by about everyone who could adopt it. There is not really anyone who doesn't do PulseAudio anymore.

Acknowledging that simple fact makes it pretty obvious that the complaints and criticisms about PulseAudio you can hear in some forums are not really shared by the vast majority of the technical people -- quite the contrary.

So, where do they come from? Usually from users who are encountering problems when running PA in conjunction with particular hardware drivers, or higher-level software.

While PA itself is certainly not bug-free (no software is) the majority of issues were triggered by misbehaving drivers or by misbehaving applications.

More specifically some applications were still using audio APIs [OSS] that are almost impossible to virtualize. And also PulseAudio makes use of a lot of driver functionality that was previously unused and hence little tested.

In fact, for quite a few parts of the lower level ALSA APIs PulseAudio is the first user of all. And of course, it cannot be a surprise that we expose bugs that were previously unknown in the drivers this way.

It's not my intention to shift the blame around though. PA and the other layers of our stack should not be viewed as independent parts. If PA uses a new or previously unused feature of the drivers then we need to fix the drivers at the same time.

If we make PA expect more correct behaviour from the apps, or that applications stop making particular assumptions about the audio stack, we need to fix the applications at the same time.

And thanks to the fact that this is all free software, doing that is actually possible. And we tried to do that in the past and are getting better at it.

One should never forget what we are doing here. We took an audio system that followed the low-level design that was current in the early '90s and brought it in one big step to what is current today.

We inserted an entire new layer into our stack right in the middle, so that we can catch up with the more advanced audio stack that Mac OS X or Windows provide right now. Doing something like this, of course, will trigger problems at many places. Criticism hence must be expected.

Also, I get a lot of personal e-mails with feedback on PulseAudio and, despite what some people might think, the positive comments actually outnumber the negative comments by far.

In addition to drivers, there is the end user experience of sound and video. What’s happening at this year’s Linux Plumbers Conference (LPC)? Have these featured heavily in your discussions?

Mostly our discussions focused on how to handle the challenges that embedded and mobile use of Linux adds to our audio stack. The departure from strictly PC-style sound hardware results in more complex and dynamic routing and control, as well as general architecture changes to minimize power usage.

Usually minimizing power usage comes at the price of higher latency. So to save power overall we need to dynamically adjust to what the currently active audio applications require, so that we can provide for both low-latency and high-latency applications at the same time without burning any more power than we really have to.
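The trade-off described here can be illustrated with a small sketch. It assumes, purely for illustration, that each active stream declares the highest latency it can tolerate and that the server then picks the longest buffer that still satisfies the most demanding stream. The names `Stream` and `choose_buffer_latency_ms` and the 10 ms and 2000 ms bounds are hypothetical and are not taken from PulseAudio's API.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """A hypothetical audio stream and the worst latency it can tolerate."""
    name: str
    max_latency_ms: int


def choose_buffer_latency_ms(streams, floor_ms=10, ceiling_ms=2000):
    """Pick the longest buffer (fewest wakeups) that every active stream accepts.

    floor_ms and ceiling_ms are illustrative bounds, not real hardware limits.
    """
    if not streams:
        # Nothing is playing, so sleep as long as allowed to save power.
        return ceiling_ms
    strictest = min(s.max_latency_ms for s in streams)
    # Clamp the result to the range the (assumed) hardware can deliver.
    return max(floor_ms, min(strictest, ceiling_ms))


music = Stream("music player", 2000)
voip = Stream("VoIP call", 20)

print(choose_buffer_latency_ms([music]))        # 2000
print(choose_buffer_latency_ms([music, voip]))  # 20
```

With only the music player active the buffer can be long and the CPU wakes up rarely. As soon as the VoIP call starts, the buffer shrinks to meet its requirement, and it can grow again when the call ends.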

Another big topic was how to improve the ability to auto-discover the features and capabilities of sound cards, as well as choosing good and automatic defaults for them -- all for making things work better out-of-the-box.

Will Linux ever get an “it just works” sound and video stack that can handle any type of media without the need for user configuration? What’s the direction here?

Of course, making things "just work" is our definite goal with all the work we do for the Linux desktop. Since the last LPC we have seen quite a few changes on the various building blocks of our audio infrastructure and they are now neatly falling into place.

More specifically we tackled a couple of issues since then. For example, the volume control should now be initialized at a sane level out-of-the-box -- no more fiddling with numerous mixer controls until the volume level is good.

We also automatically discover the properties and capabilities of sound cards much better, and there are now nice UIs for switching between configurations such as HDMI, SPDIF, Analog, Surround and so on -- and all that right during runtime without even having to stop your music.

Because we now use real-time features of Linux by default on desktops, dropouts should be much harder to trigger.

Bluetooth audio has been cleaned up and polished on all levels of our stack too. Just power on your Bluetooth headphones, activate them by clicking on the Bluetooth icon on the desktop panel, and they are ready for use.

We've come a long way, and at various places PulseAudio on Linux already shines. But we still have some way to go. Unfortunately there are a few places which we won't be able to fix at all to “smoothen” the user experience.

For example, due to the patent situation it is unlikely that desktops will support some audio and video encodings (AC3, MP3 decoding) out-of-the-box -- which is something that can only be fixed politically, not technically.

Copyright © 2009 IDG Communications, Inc.
