Microsoft DirectShow is an extensible, filter-based framework built on the Microsoft Windows Component Object Model (COM). It provides a common interface for media across many of Microsoft's programming languages, and it can render or record media files on demand, at the request of the user or developer. DirectShow also contains DirectX plug-ins for audio signal processing, and DirectX Video Acceleration for accelerated video playback.
Microsoft produced the DirectShow multimedia framework and API, which replaced the Video for Windows technology (VFW), to enable software developers to perform various operations on media files. DirectShow development tools and documentation are distributed as part of the Microsoft Platform SDK.
Most video-related Windows applications, such as Microsoft's Windows Media Player, Winamp, and Windows Movie Maker, use DirectShow to manage multimedia content. DirectShow's most notable competitor is Apple Computer's QuickTime framework.
DirectShow divides the processing of a multimedia task, such as video playback, into a set of steps. Each step, or stage in the processing of the data, is called a filter. Filters are connected together by input and output pins. Filters can be connected in different ways for different tasks, building a filter graph that contains all of the filters necessary to perform a specific task. Developers can add custom effects or other filters at any stage in the graph, then render media that comes from a file, URL, or camera.
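The idea of a graph as a chain of connected processing steps can be sketched in a few lines of portable C++. This is a toy model under stated assumptions, not the DirectShow API: `Filter`, `FilterGraph`, and `BuildPlaybackGraph` are hypothetical names introduced only for illustration, and real filters exchange media samples through COM pin interfaces rather than strings.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model only: hypothetical types, not DirectShow interfaces.
struct Filter {
    std::string name;
    // One processing step: takes the upstream output, returns this filter's output.
    std::function<std::string(const std::string&)> process;
};

class FilterGraph {
public:
    // Add a filter; its input is treated as connected to the previous filter's output.
    void Add(Filter f) { filters_.push_back(std::move(f)); }

    // Push one piece of data through every filter, in connection order.
    std::string Run(std::string data) const {
        for (const Filter& f : filters_) {
            data = f.process(data);
        }
        return data;
    }

private:
    std::vector<Filter> filters_;
};

// Builds an illustrative playback chain: source -> demultiplexer -> decoder -> renderer.
FilterGraph BuildPlaybackGraph() {
    FilterGraph g;
    g.Add({"source",        [](const std::string& s) { return "read(" + s + ")"; }});
    g.Add({"demultiplexer", [](const std::string& s) { return "demux(" + s + ")"; }});
    g.Add({"decoder",       [](const std::string& s) { return "decode(" + s + ")"; }});
    g.Add({"renderer",      [](const std::string& s) { return "render(" + s + ")"; }});
    return g;
}
```

Calling `BuildPlaybackGraph().Run("clip.mpg")` nests the steps in the order the filters were connected, which is the property the filter graph relies on.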
DirectShow Base Classes, a set of C++ classes provided in the DirectShow SDK, are used to build most filters. These handle much of the creation, registration, and connection logic for the filter.
Since the entire concept of rendering, converting, and capturing files in DirectShow is based on filters and filter graphs, it is important to understand the role of each filter.
Source Filter. This is usually the first filter in the graph. It is responsible for reading the input data. The data can come from a file on disk, a network, or any other source.
Demultiplexer. This filter is responsible for splitting the media streams. It is usually connected to the source filter. For example, the filter input might be the actual file or network stream, while the output would be separate audio and video streams.
Video/Audio Decoder. These filters handle the actual decoding or decompression. They do not demultiplex, so data should be demultiplexed before it is passed to the decoder. Therefore, they are usually connected to the demultiplexer output. For example, the video decoder input might be a compressed video stream such as MPEG2, and the output could be raw video data.
Renderer. These filters are used to actually render data. Data could be audio, video, or both. For example, when playing a media file with both audio and video, a video renderer would handle displaying the video on the screen, and an audio renderer would handle directing the audio data to the sound device. The input of the renderer is usually uncompressed data coming from the decoder.
Audio/Video Encoder. These filters are used to compress audio or video data. The input is usually uncompressed audio or video data, and the output is the compressed version of the same data.
Multiplexer. This filter is responsible for joining media streams. Input is usually compressed data from an audio/video encoder. The output is a single stream containing both video and audio data.
Sink Filter. These filters are usually the last filters in the graph. They can handle writing the data to disk to create a media file, or they can send the data to some other location, such as over a network.
Video/Audio Processor. These are usually custom filters used to perform some type of data processing or generate some type of event. LEAD has created many video and audio processors, such as the Video Resize Filter, used to resize a video stream. Usually, these filters only handle uncompressed data, so they would be inserted in the filter graph before the encoder or after the decoder.
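The placement rule for processors (only where data is uncompressed) can be checked mechanically. The following is a minimal sketch, assuming a linear chain of filter roles; `Role` and `ProcessorsSeeUncompressedData` are hypothetical names, not part of DirectShow or LEADTOOLS. The `sourceIsCompressed` parameter is an assumption added here to distinguish a file source (compressed data) from a capture source that already delivers raw data.

```cpp
#include <vector>

// Hypothetical role labels matching the filter types described above.
enum class Role {
    Source, Demultiplexer, Decoder, Processor,
    Encoder, Multiplexer, Renderer, Sink
};

// Returns true if every Processor in the chain sits at a point where the
// data is uncompressed, i.e. after a decoder or before an encoder.
bool ProcessorsSeeUncompressedData(const std::vector<Role>& chain,
                                   bool sourceIsCompressed = true) {
    bool uncompressed = !sourceIsCompressed;
    for (Role r : chain) {
        switch (r) {
            case Role::Decoder:   uncompressed = true;  break;  // output is raw data
            case Role::Encoder:   uncompressed = false; break;  // output is compressed
            case Role::Processor: if (!uncompressed) return false; break;
            default: break;  // other roles do not change compression state
        }
    }
    return true;
}
```

A conversion chain of Source, Demultiplexer, Decoder, Processor, Encoder, Multiplexer, Sink passes this check, while moving the Processor ahead of the Decoder fails it.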
DirectShow filter graphs are widely used in video playback, in which the filters provide steps such as file parsing, video and audio demultiplexing, decompressing, and rendering. They are also used for video and audio recording and editing, and for interactive tasks such as DVD navigation. During rendering, the filter graph searches the Windows Registry for registered filters, builds the graph, and connects the filters together. It then plays, pauses, or otherwise controls the created graph at the developer's request. GraphEdit, a free utility that ships with the DirectShow SDK, can be used to build and test custom graphs, filter by filter.
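Automatic graph building can be pictured as repeatedly looking up a registered filter that accepts the current data type until a renderer is reached. The sketch below is a deliberately simplified, greedy model with no backtracking; `RegistryEntry`, `AutoBuild`, and the type strings are hypothetical, and an in-memory list stands in for the Windows Registry. The actual DirectShow search is more involved than this.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a registered filter. An empty outputType marks a renderer.
struct RegistryEntry {
    std::string name;
    std::string inputType;
    std::string outputType;
};

// Greedily chains filters starting from `type`. Returns the chosen filter
// names in order, or an empty vector if no complete chain is found.
std::vector<std::string> AutoBuild(const std::vector<RegistryEntry>& registry,
                                   std::string type) {
    std::vector<std::string> chain;
    // Bound the number of steps so a cyclic registry cannot loop forever.
    for (std::size_t step = 0; step <= registry.size(); ++step) {
        const RegistryEntry* next = nullptr;
        for (const RegistryEntry& e : registry) {
            if (e.inputType == type) { next = &e; break; }
        }
        if (next == nullptr) return {};               // nothing accepts this type
        chain.push_back(next->name);
        if (next->outputType.empty()) return chain;   // reached a renderer
        type = next->outputType;
    }
    return {};
}
```

With entries for a splitter, a decoder, and a renderer whose types line up, `AutoBuild` returns all three names in order; with an unknown starting type it returns an empty chain.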
Each filter in a filter graph handles a specific task, and each filter is usually designed to handle a specific type of data or stream.
For example, to create an MPEG2 file, you would need an MPEG2 Encoder and an MPEG2 Multiplexer. Most likely, the MPEG2 Encoder will only create MPEG2 compressed data, and the MPEG2 Multiplexer will only accept MPEG2 video and certain types of audio related to MPEG2 as inputs. The same goes for decoding and demultiplexing. An MPEG2 Decoder will only decode MPEG2 video, and an MPEG2 Demultiplexer will only accept as input a stream containing MPEG2 video and certain types of audio related to MPEG2.
If you try to connect filters that do not agree on data types, the connection is usually refused and the graph will not run. This is why it is important to know what media type each filter supports.
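Media type agreement between two pins can be modeled as a simple lookup. This is a sketch only: `OutputPin`, `InputPin`, `NegotiateType`, and the type strings are hypothetical placeholders and do not reflect the media types of any particular filter. Real DirectShow pins describe media types with structures rather than strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical pin models: an output pin offers types in order of preference,
// and an input pin lists the types it is willing to accept.
struct OutputPin { std::vector<std::string> offeredTypes; };
struct InputPin  { std::vector<std::string> acceptedTypes; };

// Returns the first offered type that the input pin accepts.
// An empty string means the connection is refused.
std::string NegotiateType(const OutputPin& out, const InputPin& in) {
    for (const std::string& offered : out.offeredTypes) {
        if (std::find(in.acceptedTypes.begin(), in.acceptedTypes.end(), offered)
                != in.acceptedTypes.end()) {
            return offered;
        }
    }
    return std::string();
}
```

An encoder output offering a type the multiplexer input lists is connected with that type; an output offering only unlisted types gets an empty result, which corresponds to a refused connection.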
For example, the following table lists media types supported by the LEAD MPEG2 Multiplexer. Attempting to connect any media type other than those listed to the input of the LEAD MPEG2 Multiplexer results in the connection being refused.
Table 1. Media Types Supported by the LEAD MPEG2 Multiplexer
The following figures illustrate basic filter graphs for capture, conversion, and playback:
Figure 1. A Simple Capture Graph
Figure 2. A Complex Capture Graph
Figure 3. A Simple Conversion Graph
Figure 4. A Simple Playback Graph
For the filter graph to use filters automatically, the filters need to be registered in a separate DirectShow registry entry, as well as being registered with COM. (However, if the application adds the filters manually, they do not need to be registered at all.)
DirectShow uses a "merit system" to determine which filter handles a specific task. The merit of each filter is a value stored in the registry, and that value is chosen by the creator of the filter. The actual properties or quality of a filter have nothing to do with its merit. For example, to decompress MPEG2 data, you need an MPEG2 decoder. If multiple MPEG2 decoders exist on the machine, the decoder with the highest merit value in the registry will be used.
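Merit-based selection can be sketched as picking the highest value among the candidates that accept a given type. The constants below mirror the values of the `MERIT_*` constants in the DirectShow headers; everything else (`RegisteredFilter`, `PickByMerit`, the in-memory list standing in for the registry, and the first-registered-wins tie-break) is an assumption made for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Values mirror MERIT_PREFERRED, MERIT_NORMAL, MERIT_UNLIKELY, and
// MERIT_DO_NOT_USE from the DirectShow headers.
constexpr std::uint32_t kMeritPreferred = 0x800000;
constexpr std::uint32_t kMeritNormal    = 0x600000;
constexpr std::uint32_t kMeritUnlikely  = 0x400000;
constexpr std::uint32_t kMeritDoNotUse  = 0x200000;

// Hypothetical stand-in for a filter's registry entry.
struct RegisteredFilter {
    std::string name;
    std::string inputType;
    std::uint32_t merit;
};

// Returns the highest-merit filter that accepts `type`, or nullptr if none
// qualifies. Filters at or below kMeritDoNotUse are skipped. On a tie, the
// earlier entry wins (an assumption of this sketch).
const RegisteredFilter* PickByMerit(const std::vector<RegisteredFilter>& registry,
                                    const std::string& type) {
    const RegisteredFilter* best = nullptr;
    for (const RegisteredFilter& f : registry) {
        if (f.inputType != type || f.merit <= kMeritDoNotUse) continue;
        if (best == nullptr || f.merit > best->merit) best = &f;
    }
    return best;
}
```

Note that nothing in this selection looks at the quality of the decoder: a filter registered with `kMeritPreferred + 1` wins over one registered with `kMeritNormal` regardless of how well either performs.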
Many companies now develop codecs in the form of DirectShow filters, resulting in the presence of several filters that can decode the same media type. "Codec hell" ensues when multiple DirectShow filters, all used for encoding or decoding the same media type, exist on a given computer. Under the merit system, implementations often compete with one another by registering themselves with increasingly elevated priority.
Thus, although DirectShow is capable of dynamically building a graph to render a given media type, it becomes difficult for developers to rely on this functionality when the resulting filter graph is variable. It is possible that filter graphs will change over time, as new filters are introduced to the computer. This can result in a support nightmare for developers and businesses. Developers often resort to manually building filter graphs to be certain of their contents, crippling one of DirectShow's more appealing features.
DirectShow has many interfaces for capturing from and controlling many types of webcams, TV tuners, and other devices that have a DirectShow driver. Options include controlling the TV tuner, and setting many common device properties such as capture size, color space, and frame rate.
However, actual control over the device is limited by what the manufacturer of that device has exposed in the DirectShow driver, and what has been implemented in the device itself. For example, one device may support capturing at 3 different resolutions, while another supports 10. One device may allow you to change the frame rate, while another may not.
The problem with DirectShow is that its greatest strength is also its greatest weakness. Its greatest strength is its ability to "get under the hood" to connect filters programmatically, create custom filters, etc. But that flexibility has a cost: complexity.
LEAD's solution, the LEADTOOLS Multimedia SDK, handles the complexity "under the hood" but exposes multimedia functionality to the developer through dozens of easy-to-use interfaces.
Three of these interfaces handle the most common tasks: ltmmCapture, ltmmPlay and ltmmConvert. They simplify the process of using DirectShow to connect the correct filters in the correct order, enumerate available capture devices for capture, navigate DVDs, perform media playback, and many other tasks.
By default, DirectShow supports several common media file formats, such as MPEG1 (decoding only), MP3, Windows Media Video, and plain static images. DirectShow is also extensible: additional filters allow it to support other container formats and other audio or video codecs.
LEAD has created dozens of these extensions, which are included with the LEADTOOLS Multimedia SDK. A current list of available encoders, decoders, multiplexers, demultiplexers, sink/source, and other filters is at LEADTOOLS DirectShow Filters.
DirectShow filters provide a modular solution for dealing with different multimedia tasks, but can pose a problem if you do not know which filters are being used in your graph. Most issues can be resolved through examining the graph currently in use. There are several useful tools to help with this.
EditGraph Method. The "EditGraph" method, provided in all three major interfaces (Capture, Convert, and Play) in the LEADTOOLS Multimedia SDK, allows you to view the current graph using GraphEdit, a free utility that ships with the DirectShow SDK. After EditGraph is called, a message box appears confirming that the current graph has been registered. At this point, open GraphEdit and select the "Connect To Remote Graph" option. Your graph will be listed as an available graph. Select it, and the graph will be displayed in GraphEdit.
Example 1: Debugging Using GraphEdit. An MPEG2 file plays on one machine but does not play on another machine. Probably, there are different filters in use on the different machines. Run GraphEdit on each machine, and compare the filter graphs. Most likely you will see a difference in the filters being used on the two machines. If a non-LEAD decoder is being used which is not handling the data properly, unregister that decoder, so that the LEAD decoder will be used and the file will render correctly.
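Once the filter names from each machine's graph have been noted down, the comparison itself is a set difference. The helper below is a minimal sketch; `OnlyIn` is a hypothetical name, and the filter names would be whatever GraphEdit shows on each machine.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Returns the filter names that appear in `a` but not in `b`.
// Both lists are taken by value so they can be sorted locally.
std::vector<std::string> OnlyIn(std::vector<std::string> a,
                                std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> result;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(result));
    return result;
}
```

Running `OnlyIn` in both directions (working machine versus failing machine, then the reverse) shows which filter was swapped, which is usually the decoder to investigate.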
AMCAP. AMCAP is a free capture utility that ships with the Microsoft DirectX SDK. It is not a LEAD product, but uses the same DirectX code that LEAD uses to capture from capture devices. For this reason, it is a good application to test devices that appear to function incorrectly with LEAD.
Example 2: Debugging Using AMCAP. Using the LEAD Capture Control, the preview image for a device is all black. Build a preview image for the same device using AMCAP. If the preview is also all black in AMCAP, the problem is most likely the device driver. In that case, obtain an update from the manufacturer (if available). However, if the same problem does not occur using AMCAP, contact the LEADTOOLS Support Department with details of the issue and the make/model of the device.
Decoder. Also known as a decompressor, this is a module or algorithm that decompresses data.
DirectShow. A multimedia framework and API produced by Microsoft for software developers to perform various operations with media files. Most video-related applications on Windows, such as Microsoft's Windows Media Player, use DirectShow to manage multimedia content.
Encoder. Also known as a compressor, this is a module or algorithm that compresses data. Playing that data back requires a decompressor, or decoder.
Multiplexer. A module that combines audio and video into one file.
Video stream. The portion of the file holding the video data. The video data might be compressed to save disk space. The data has to be decompressed using a video decompressor (decoder) before you can play it.