
Phonon Overview

Introduction

Multimedia is media that contains multiple forms of information content, e.g., audio, text, animation, video, and interactivity. With its multimedia framework, Phonon, Qt provides functionality to play back and manipulate the most common multimedia formats, including from physical storage media such as CDs and DVDs. Many formats, such as MPEG and AVI video files, can be played back.

In this overview, we take a look at the main concepts of Phonon. We also explain each component of the architecture, examine the core API classes, and show examples of how to use the classes provided. The Phonon API documentation can be found in KDE's API reference: http://api.kde.org/4.0-api/kdelibs-apidocs/phonon/html/index.html.

Architecture

Phonon is a graph-based framework. The nodes in the graph are simple objects; they inherit from QObject, take input, and give output. The input and output are data streams with multimedia content. The connections over which streams travel between nodes are called paths.

Phonon has three basic concepts: media objects, sinks, and paths. A media object manages a media source, which, for instance, can be a file, DVD, or URL; it might be thought of as a minimal media player that outputs one or more streams that can be used by other nodes in the graph. A sink is an output device, which can play back the media, for instance video on a widget using a graphics card, or sound using a sound card.

Both the media object and sinks are nodes in the graph. All nodes in the graph take a media stream as input, process the stream, and possibly output it to another node in the graph. Below is an image of such a media graph for an audio stream.

Audio playback is started and managed by the media object, which sends the media stream to any sinks connected to it by a path. The sink then plays the stream back, usually through a sound card.

All nodes in the graph are synchronized by the framework, meaning that if more than one sink is connected to the same media object, the framework will handle the synchronization between the sinks; this happens, for instance, when a media source containing video with sound is played back. More on this later.
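The graph idea can be illustrated without Phonon at all. The following is a minimal sketch in standard C++, not Phonon code: ToySource, ToySink, connect(), and play() are hypothetical names invented for this illustration, and a stream is reduced to a sequence of integer samples. It only shows how one source node can feed every sink that is connected to it by a path.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a sink: it records what it "plays".
struct ToySink {
    std::string name;
    std::vector<int> played;
    void receive(int sample) { played.push_back(sample); }
};

// Hypothetical stand-in for a media object. It does not own its sinks;
// it only keeps non-owning links (the "paths") to the sinks it feeds.
class ToySource {
public:
    // Connecting a sink plays the role of creating a path.
    void connect(ToySink *sink) {
        assert(sink != nullptr);
        paths_.push_back(sink);
    }

    // Push each sample to every connected sink before moving on to the
    // next sample, so all sinks stay at the same position in the stream.
    void play(const std::vector<int> &samples) {
        for (int sample : samples)
            for (ToySink *sink : paths_)
                sink->receive(sample);
    }

private:
    std::vector<ToySink *> paths_;
};
```

In this toy model, connecting an "audio" sink and a "video" sink to the same ToySource and calling play() delivers the identical sample sequence to both. Real synchronization in Phonon is handled by the back end and is considerably more involved.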

Media Objects

A media object, an instance of the MediaObject class, is an interface for media playback. The object will be a source object in the graph, providing other nodes with the media.

The media object itself knows how to play back the media, and informs other nodes of the media type of its source. The playback can be controlled by the user. For video playback, for instance, a media object can start, stop, fast forward, and rewind, i.e., control the state of the playback. You may think of the object as a simple media player. It can also queue media for playback. Manipulation of the media is left to other nodes, e.g., sinks.

The media data is provided by a media source, which is encapsulated by the media object. The media source itself is a separate object, an instance of MediaSource, in Phonon, and not part of the graph itself. The source supplies the media object with raw data. The source can fetch its data from files, URLs, DVD/CD, and also any QIODevice. The contents of the source will be interpreted by the media object.

A media object is always instantiated with the default constructor and then supplied with a media source. Concrete code examples are given later in this overview.

As a complement to the media object, Phonon also provides MediaController, which provides control over features that are optional for a given medium. For instance, for DVD playback, chapters, menus, and titles are features managed by a MediaController.

Sinks

A sink is a node that can output media, and is an output node in the graph, i.e., it does not send its output to other nodes. A sink is usually a rendering device.

The input of sinks in a Phonon media graph comes from a MediaObject, though it might have been processed through other nodes on the way.

While the MediaObject controls the playback, the sink has basic controls for manipulation of the media. With an audio sink, for instance, you can control the volume and mute the sound, i.e., it represents a virtual audio device. Another example is the VideoWidget, which can render video on a QWidget and alter the brightness, hue, and scaling of the video.

As an example we give an image of a graph used for playing back a video file with sound.

Processors

Phonon does not allow manipulation of media streams directly, i.e., one cannot alter a media stream's bytes programmatically once they have been given to a media object. However, a back end can offer processors, which are nodes placed in the graph on the path somewhere between the media object and its sinks. In Phonon, processors are subclasses of the Effect class.

The processor is inserted into the rendering process, altering the media stream, and will be active as long as it is part of the graph. To stop it, it must be removed from the graph.

Effects may also have parameters that control how the media stream is manipulated. A processor applying a depth effect to audio, for instance, can have a value controlling the amount of depth. An Effect can be configured at any point in time.
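The behavior described above (an effect is active only while it is on the path, and its parameters can change at any time) can be sketched in standard C++. This is a toy model, not Phonon code: ToyGain, ToyPath, and transfer() are hypothetical names, and the stream is reduced to single integer samples. The insertEffect() and removeEffect() names are chosen only to echo the Path functions mentioned later in this overview.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical effect: scales every sample by an adjustable factor.
struct ToyGain {
    int factor;
    int process(int sample) const { return sample * factor; }
};

// Hypothetical path: an ordered chain of non-owning effect pointers
// sitting between a source and a sink.
class ToyPath {
public:
    void insertEffect(ToyGain *effect) {
        assert(effect != nullptr);
        effects_.push_back(effect);
    }

    // Returns true if the effect was on the path and has been removed.
    bool removeEffect(ToyGain *effect) {
        auto it = std::find(effects_.begin(), effects_.end(), effect);
        if (it == effects_.end())
            return false;
        effects_.erase(it);
        return true;
    }

    // Run one sample through every effect currently on the path.
    int transfer(int sample) const {
        for (const ToyGain *effect : effects_)
            sample = effect->process(sample);
        return sample;
    }

private:
    std::vector<ToyGain *> effects_;
};
```

Because the path stores pointers rather than copies, changing factor on a ToyGain that is already inserted affects the next sample passed through transfer(), and removing the effect restores the unprocessed stream.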

Playback

In some common cases, it is not necessary to build a graph yourself to play back multimedia.

Phonon has convenience functions for building common graphs. For playing an audio file, you can use the createPlayer() function. This will set up the necessary graph and return the media object node; the sound can then simply be started by calling its play() function. The code example below shows how trivial this is:

 using namespace Phonon;

 MediaObject *music =
     createPlayer(Phonon::MusicCategory,
                  MediaSource("/path/mysong.wav"));
 music->play();

Notice that we use the namespace Phonon, of which createPlayer() is a member.

We have a similar solution for playing video files, the VideoPlayer. The coding necessary is also trivial:

 VideoPlayer *player =
     new VideoPlayer(Phonon::VideoCategory, parentWidget);
 player->play(url);

The VideoPlayer is a widget onto which the video will be drawn.

The project's .pro file needs the following Phonon-specific line:

 QT += phonon

Phonon comes with several widgets that provide functionality commonly associated with multimedia players - notably SeekSlider for controlling the position of the stream, VolumeSlider for controlling sound volume, and EffectWidget for controlling the parameters of an effect. You can learn about them in the API documentation.

Building Graphs

If you need more freedom than the convenience functions described in the previous section offer, you can build the media graphs yourself. We will take a look at how some common graphs are built for playback of multimedia. We also give code snippets that create the graphs. Starting a graph up is a matter of calling the play() function of the media object.

If the media source contains several types of media, for instance, a stream with both video and audio, the graph will contain two output nodes: one for the video and one for the audio.

We will now look at the code required to build the graphs discussed previously in the Architecture section. All code examples assume that the Phonon namespace is used, i.e., code is preceded by:

 using namespace Phonon;

Audio

When playing back audio, you create the media object and connect it to an audio output node - a node that inherits from AbstractAudioOutput. Currently, the only node provided is AudioOutput, which outputs audio to the sound card.

The code to create the graph is straightforward:

 MediaObject *mediaObject = new MediaObject(this);
 mediaObject->setCurrentSource(MediaSource("/mymusic/barbiegirl.wav"));
 AudioOutput *audioOutput =
     new AudioOutput(Phonon::MusicCategory, this);
 Path path = createPath(mediaObject, audioOutput);

Notice that the type of media an input source has is resolved by Phonon, so you need not be concerned with this. If a source contains multiple media formats, this is also handled automatically.

The media object is always created using the default constructor since it handles all multimedia formats.

The setting of a Category, Phonon::MusicCategory in this case, does not affect the actual playback; the category can be used by KDE to control the playback through, for instance, the control panel. KDE users can often also choose to route sound in the CommunicationCategory (used, for example, by VoIP applications) to their headset, while sound in the MusicCategory is sent to the sound card.

The AudioOutput class outputs the audio media to a sound card, that is, one of the audio devices of the operating system. An audio device can be a sound card or an intermediate technology, such as DirectShow on Windows. A default device will be chosen if one is not set with setOutputDevice().

The AudioOutput node will work with all audio formats supported by the back end, so you don't need to know what format a specific media source has.

For an extensive example of audio playback, see the Phonon Music Player.

Audio Effects

Since a media stream cannot be manipulated directly, the backend can produce nodes that can process the media streams. These nodes are inserted into the graph between a media object and an output node.

Nodes that process media streams inherit from the Effect class. The effects available depend on the underlying system. Most of these effects will be supported by Phonon. See the Querying Backends for Support section for information on how to resolve the available effects on a particular system.

We will now continue the example from above using the Path variable path to add an effect. The code is again trivial:

 // Assumes the back end reports at least one audio effect.
 Effect *effect = new Effect(
     BackendCapabilities::availableAudioEffects()[0], this);
 path.insertEffect(effect);

Here we simply take the first available effect on the system.

The effect will start immediately after being inserted into the graph if the media object is playing. To stop it, you have to detach it again using removeEffect() of the Path.

Video

For playing video, VideoWidget is provided. This class functions both as a node in the graph and as a widget upon which it draws the video stream. The widget will automatically choose an available device for playing the video, which is usually a technology between the Qt application and the graphics card, such as DirectShow on Windows.

The video widget does not play the audio (if any) in the media stream. If you want to play the audio as well, you will need an AudioOutput node. You create and connect it to the graph as shown in the previous section.

The code for creating this graph is given below, after which one can play the video with play().

 MediaObject *media = new MediaObject(this);

 VideoWidget *videoWidget = new VideoWidget(this);
 createPath(media, videoWidget);

 AudioOutput *audioOutput =
     new AudioOutput(Phonon::VideoCategory, this);
 createPath(media, audioOutput);

The VideoWidget does not need to be assigned a Category; it is automatically classified as VideoCategory. We only need to ensure that the audio is classified in the same category.

The media object will split files with different media content into separate streams before sending them off to other nodes in the graph. It is the media object that determines the type of content appropriate for nodes that connect to it.

Backends

The multimedia functionality is not implemented by Phonon itself, but by a back end - often also referred to as an engine. This includes connecting to, managing, and driving the underlying hardware or intermediate technology. For the programmer, this implies that the media nodes, e.g., media objects, processors, and sinks, are produced by the back end. Also, it is responsible for building the graph, i.e., connecting the nodes.

Qt provides a back end for each of its platforms. The back ends in turn use the media systems DirectShow (which requires DirectX) on Windows, QuickTime on Mac, and GStreamer on Linux. The functionality provided on the different platforms is dependent on these underlying systems and may vary somewhat, e.g., in the media formats supported.

Backends expose information about the underlying system. A backend can tell which media formats are supported, e.g., AVI, mp3, or OGG. It will also report what kinds of nodes it can produce, and how they can be connected to each other. This also includes a textual description suitable for users of a Phonon application.

A user can often add support for new formats and filters to the underlying system, by, for instance, installing the DivX codec. We can therefore not give an exact overview of which formats are available in the Qt backends.

Qt backends also handle threads and networking, so the programmer need not be concerned with this. Streaming over networks is also implemented.

Querying Backends for Support

As mentioned, Phonon depends on the backend to provide its functionality. Depending on the individual backend, full support of the API may not be in place. Applications therefore need to check with the backend whether the functionality they require is implemented. In this section, we take a look at how this is done.

The backend provides the availableMimeTypes() and isMimeTypeAvailable() functions to query which MIME types the backend can produce nodes for. The types are listed as strings; the string for a given type is the same on every backend and platform.

The backend will emit a signal - Notifier::capabilitiesChanged() - if its support has changed, e.g., if a USB headset is plugged into the system.

To query the audio devices that are available, we have availableAudioOutputDevices(), as mentioned in the Sinks section. To query information about the individual devices, you can examine their name(); this string is dependent on the operating system, and the Qt backends do not analyze the devices further.

The sink for playback of video does not have a selection of devices. For convenience, the VideoWidget is both a node in the graph and a widget on which the video output is rendered. To query the various video formats available, use isMimeTypeAvailable(). To add it to a path, you can use Phonon::createPath() as usual. After creating a media object, it is also possible to call its hasVideo() function.

See also the Capabilities Example.

Installing Phonon

Windows

On Windows, Phonon requires DirectX version 9 or higher.

For Phonon application development, the platform and DirectX SDKs must be installed. The name of the platform SDK may vary between Windows versions; on Vista, it is called Windows SDK.

Note that these SDKs must be placed before your compiler in the include path; this should, however, be detected and handled automatically by Windows.

Linux

The Qt backend on Linux uses GStreamer (minimum version is 0.10), which must be installed on the system. It is a good idea to install every package available for GStreamer to get support for as many MIME types and audio effects as possible. At a minimum, you need the GStreamer library and base plugins, which provide support for .ogg files. The package names may vary between Linux distributions; on Mandriva, they have the following names:

Package                               Description
libgstreamer0.10_0.10                 The GStreamer base library.
libgstreamer0.10_0.10-devel           Contains files for developing applications with GStreamer.
libgstreamer-plugins-base0.10         Contains the basic plugins for audio and video playback, and will enable support for ogg files.
libgstreamer-plugins-base0.10-devel   Makes it possible to develop applications using the base plugins.

Mac

On the Mac, Qt uses QuickTime for its backend. The minimum supported version is 7.0.

Work in Progress

Phonon and its Qt backends, though fully functional for multimedia playback, are still under development. Planned functionality includes media capture and more processors for both music and video files.

Another important consideration is to implement support for storing media to files, i.e., not playing back media directly.

We also hope in the future to be able to support direct manipulation of media streams. This will give the programmer more freedom to manipulate streams than just through processors.

Currently, the multimedia framework supports one input source, e.g., a file or a CD. In the future, it will be possible to include several sources. This is useful in, for example, audio mixer applications where several audio sources can be sent, processed, and output as a single audio stream.


Copyright © 2007 Trolltech Trademarks
Qt 4.4.0-tp1