Decoding CODECs

Summary

A codec is the mechanism by which video and audio signals are compressed to conserve bandwidth before transmission across a telecommunication network. It may be implemented either in hardware or software. In the past, manufacturers sometimes used proprietary codecs that were incompatible with those from other manufacturers. As network usage became more widespread, however, it became clear that standards were needed to permit interoperability between equipment from different manufacturers. Regardless of the coding standards employed, the primary concerns for the user of a codec system are the quality of the audio and video transmitted. How does one decide what kind of codec to use? For telehealth applications, good quality is important, so it is best to look at products from established manufacturers. Ultimately, the choice of a codec for telehealth depends primarily on the trade-offs between quality, cost, bandwidth requirements and interoperability required by the application(s) for the system, and the environment in which it will be used.

Introduction

A codec is the mechanism by which video and audio signals are compressed to conserve bandwidth before transmission across a telecommunication network.¹ Codecs are common where large quantities of information have to be transmitted. If this information were transmitted in its raw form, it would use up large amounts of network bandwidth (the capacity of the network to carry the digital signals). Codecs therefore compress the digital signal to reduce its size, which makes the information easier to store and/or transmit over the network. The codec at the destination converts the compressed signal from the sender back into audio and video signals for presentation to the user; that is, the signal is decompressed. Since the same device is often used both for compressing outgoing signals and for decompressing incoming signals, it is called a codec (from compression–decompression, or coder–decoder).

The point of using compression is to reduce the network bandwidth needed to transmit the digital signal between the sending and receiving locations. An uncompressed broadcast-quality TV transmission would require roughly 100 Mbit/s; however, in many telehealth environments (especially rural areas), the availability of high-bandwidth networks is limited.
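The arithmetic behind this can be sketched in a few lines of Python. The picture size, frame rate, bits per pixel and link speed below are illustrative assumptions, not figures from this article; they are chosen only to show how a raw bit rate of the order quoted above arises and how much compression a narrow link would demand.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel):
    """Uncompressed video bit rate in Mbit/s (1 Mbit = 1,000,000 bits)."""
    return width * height * fps * bits_per_pixel / 1_000_000


def required_compression_ratio(raw_mbps, link_mbps):
    """How many times smaller the stream must become to fit the link."""
    return raw_mbps / link_mbps


# Assumed example: a 720x576 picture at 25 frames/s and 12 bits per pixel.
raw = raw_bitrate_mbps(720, 576, 25, 12)

# Assumed example link: 384 kbit/s, i.e. 0.384 Mbit/s.
ratio = required_compression_ratio(raw, 0.384)

print(f"raw: {raw:.1f} Mbit/s, compression needed: about {ratio:.0f}:1")
```

Under these assumptions the raw stream is about 124 Mbit/s, so the codec would have to shrink it by a factor of roughly 324 to fit the link.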

In order to compress a video or audio signal to this degree, some information must be taken out (so-called lossy compression). The more information that is removed, the smaller the resulting signal becomes. When the signal is decompressed at the receiving end, the missing information (or ‘loss’) is approximated by the decompression process. As a rule of thumb, the less the original signal was compressed, the better the quality of the restored signal will be.
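The trade-off between size and fidelity can be illustrated with a toy experiment. The sketch below is not how a real video or audio codec works; it assumes a synthetic 8-bit signal, removes information by zeroing the lowest bits of every sample, and then uses the general-purpose zlib compressor to measure how small the result becomes. All function names are invented for the illustration.

```python
import math
import random
import zlib


def make_signal(n=4096, seed=0):
    """Synthetic 8-bit signal: a slow sine wave plus a little noise."""
    rng = random.Random(seed)
    samples = []
    for i in range(n):
        value = 128 + 100 * math.sin(2 * math.pi * i / 512) + rng.uniform(-8, 8)
        samples.append(max(0, min(255, int(value))))
    return samples


def drop_bits(samples, bits_dropped):
    """Lossy step: zero the lowest bits of each sample (information is removed)."""
    mask = 0xFF & ~((1 << bits_dropped) - 1)
    return [s & mask for s in samples]


def compressed_size(samples):
    """Lossless step: size in bytes after zlib compression."""
    return len(zlib.compress(bytes(samples), 9))


def mean_error(original, restored):
    """Average absolute difference between original and restored samples."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


signal = make_signal()
results = []
for bits_dropped in (0, 2, 4, 6):
    lossy = drop_bits(signal, bits_dropped)
    results.append((bits_dropped, compressed_size(lossy), mean_error(signal, lossy)))

for bits_dropped, size, error in results:
    print(f"dropped {bits_dropped} bits: {size} bytes, mean error {error:.2f}")
```

Running it shows the pattern described above: the more bits are discarded, the larger the error in the restored samples, while the compressed size shrinks.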

Codecs can be implemented in either software or hardware. Hardware codecs are generally faster than their software counterparts, deliver higher quality results and do not require an associated computer. However, they are more expensive. Software codecs, which are largely intended for transmission of information between a simple device (like a DVD player) and a computer, generally take longer to analyse and convert the information. Software codecs are used for information such as digital camera image files, audio and video files downloaded from the Internet (such as a song or film clip), streaming Internet radio, multichannel home theatre audio systems, high definition video for movie theatres and PC-based DVD playback (for example, Windows Media Player with an MPEG video codec).² Hardware codecs are primarily employed where conversion speed and quality are important, particularly in applications such as videoconferencing.

How a codec supports networking

Obviously, the devices at each end of a transmission need to agree on the coding method (or algorithm) to ensure that the original information is reproduced correctly at the receiving end. In the past, manufacturers sometimes used proprietary codecs that were incompatible with those from other manufacturers. As network usage became more widespread, however, it became clear that standards were needed to permit interoperability between equipment from different manufacturers. With such standards, any decoder can reproduce information created by any encoder, so long as the encoder conforms to the coding standard.
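The idea of two devices agreeing on a coding method can be sketched as a simple capability match. This is a deliberately simplified illustration, not the negotiation procedure defined by any ITU standard; the capability lists and the proprietary format name are invented.

```python
def negotiate(sender_prefs, receiver_supported):
    """Return the first standard in the sender's preference order that the
    receiver also supports, or None if the two ends share nothing."""
    supported = set(receiver_supported)
    for standard in sender_prefs:
        if standard in supported:
            return standard
    return None


# Hypothetical capability lists for three endpoints.
unit_a = ["H.264", "H.263", "H.261"]   # most preferred first
unit_b = ["H.261", "H.263"]
proprietary_unit = ["VendorX-1"]       # made-up proprietary format

print(negotiate(unit_a, unit_b))            # H.263
print(negotiate(unit_a, proprietary_unit))  # None
```

The second call shows the interoperability problem: a unit that speaks only a proprietary format has nothing in common with standards-based equipment, so no call can be set up.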

The development of standards for video and audio transmission is by no means complete, but there are only a few standards now in use, and many codecs have the capacity to understand more than one standard. The International Telecommunication Union (ITU) is a United Nations body based in Switzerland that governs telecommunication standards and has published most of the standards that are used in network communication.³ These standards represent the technical specifications to which a coding scheme must conform in order to ensure its correct decoding.

The ITU's first practical video encoding standard, H.261, was first approved in 1988 (the revision cited here dates from 1993) and paved the way for future standardized codec designs.⁴ It has been followed by a series of successor standards, the most current being H.264, which was published in 2003. Video codecs conforming to H.264 are capable of significantly better performance than models using older standards. Standards also exist for other encoding applications, such as data, still images and audio. Some of the audio technologies, like those used in RealPlayer and Windows Media Player, are associated with particular products, while others conform to the ITU G.711, G.722 and G.728 standards so that they can be used for more general communication purposes.

Many codecs – those used in videoconferencing systems, for example – include devices for both video and audio encoding and thus deal with multiple standards in the same unit. The ITU has also published standards for these multimedia devices; H.323 is the standard for videoconferencing over IP-based networks such as the Internet, although many systems also support H.320, which allows videoconferencing over narrow-band networks such as ISDN. One of the newest standards is H.239, which allows video codecs to send two separate video channels within a single video call.⁵ This allows the users to show both data (e.g. PowerPoint slides) and video (e.g. images of the speaker) at the same time. Table 1 summarizes the common ITU telecommunication standards.

Table 1

Common ITU telecommunication standards

ITU standard	Technology	Application	Comment
H.264 (MPEG-4 AVC)	Video streaming, multimedia, telephony systems	Higher quality video with higher resolution colour	Video compression standard for flexible video across a variety of applications, networks and systems
H.239	Audio, video, data	Simultaneous transmission of multiple video and data streams	Different roles can be assigned to channels to determine what information will be carried and how it is presented to the receiver
H.320	Videoconferencing and videophone	System configuration including communication modes, call control and networking	Supports interoperability of equipment at lower network bandwidths
H.323	Audio, video, data, including codecs	IP-based networks, including the Internet	Standard for multimedia communications
T.120	Multimedia, codec	Multipoint data conferencing	Transfer of data from site to site; structure and coding for data units

Quality versus quantity

Regardless of the coding standards employed, the primary concerns for the user of a codec system are the quality of the audio and video transmitted.⁶ The audio quality is defined by the clarity and consistency of the audio portion of the transmission. In videoconferencing, this includes the synchronization of the audio with the video image (recall the often unintended humour of a foreign language movie badly dubbed into English) as well as noise, echo or inappropriate volume. Video quality is defined by the static picture resolution (number of pixels), the frame rate and of course the accuracy of the reproduction of the original scene by the transmitted image. Lack of video quality in videoconferencing manifests itself as jerky images, blank or blurry segments of the image or long periods of frozen frames. The audio and video quality of a particular codec is dependent on the codec's compression format, as well as its control mechanism, strategy and other characteristics. Display devices can also affect video quality, since they determine how sharp or how bright the images appear to the viewer.

In addition, compression and decompression take time; larger packages of information (high-resolution images or movies, for example) take longer to process than smaller ones such as single pictures and also take longer to transmit over a network. The term for the delay introduced by these and other factors is latency. In applications such as streaming video where timing is not critical, latency is not a particular concern; however, in real-time applications such as videoconferencing or telesurgery, latency becomes a very important matter. There is general agreement that for videoconferencing to be of reasonable quality, the latency in the network should be less than 150 ms.⁷ Most modern videoconferencing units are capable of performing their encoding and decoding within this limit.
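A latency budget of this kind can be checked with simple addition. The sketch below assumes that the 150 ms figure is treated as a one-way, end-to-end limit covering the codec as well as the network; the individual delay values are hypothetical and would have to be measured on a real system.

```python
MAX_LATENCY_MS = 150  # threshold quoted in the text for acceptable videoconferencing


def total_latency_ms(stages):
    """Sum the delay contributed by each stage of a one-way path."""
    return sum(stages.values())


def within_budget(stages, limit_ms=MAX_LATENCY_MS):
    """True if the end-to-end delay stays under the limit."""
    return total_latency_ms(stages) < limit_ms


# Hypothetical delay figures in milliseconds; real values must be measured.
call = {"capture": 15, "encode": 40, "network": 50, "decode": 25, "display": 15}
print(total_latency_ms(call), within_budget(call))                      # 145 True

# The same call over an assumed high-delay link.
satellite_call = dict(call, network=280)
print(total_latency_ms(satellite_call), within_budget(satellite_call))  # 375 False
```

The second case shows that even a fast codec cannot rescue a call when the network alone exceeds the budget.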

Choosing a codec for telehealth

A health-care organization developing or expanding its telehealth capacities will need to ask, among several other questions, which codec is most appropriate for its requirements. The answer will depend on both the needs of the particular telehealth application(s) to which it will be applied and the constraints on the application(s).

In streaming or broadcast applications there is one-way transmission of information to one or more receivers. This allows, for example, delivery of multimedia educational material, or signals from telemedicine peripherals such as digital stethoscopes. Codecs may be used to reduce the network bandwidth required. Two-way transmission, such as videoconferencing, requires more bandwidth because information flows in both directions between the sender and receiver. Codecs are essential to reduce the network bandwidth required.

In addition, telehealth applications require varying levels of image or video quality, depending on how they are being used. If videoconferencing is being used for administrative or conferencing purposes, some loss of audio and video synchronization is tolerable. However, if it is being applied in a diagnostic environment such as a remote consultation with a patient or practitioner, synchronization becomes very important, as it will affect the user's perception of the remote encounter and perhaps even the correctness of a diagnosis.

The constraints on quality are many, however.⁸ System cost can be significant. While low-cost codec systems do exist (one can set up a videoconferencing system with inexpensive web cameras and free software), creating a reliable, high quality system that is easy to operate requires both good codecs and good network bandwidth. Advances in technology have reduced the price of codecs over the last several years, but the devices needed for telehealth applications are still not cheap. Other issues that need to be considered are interoperability among different vendors' devices (although this is less of a problem than it used to be, thanks to the encoding standards) and the physical environment in which the codec will be installed and used. For example, if the codec is going to be used strictly in an examination room, its video display capacity can be lower than one to be used in a large auditorium for training.

How does one decide what kind of codec to use? For telehealth applications, good quality is important, so it is best to look at products from established manufacturers. Equipment must adhere to the ITU standards. A desktop unit that includes a camera and can double as a computer monitor is a good choice for applications where a single practitioner is providing consultations to remote patients or colleagues. Applications for larger audiences, such as seminars, will require units that can be connected to a television screen or LCD projector. Similarly, observing larger areas (groups or whole rooms) requires a unit that has a camera with a wide-angle field of view and a pan and zoom capacity. Codecs and associated telehealth peripherals can be mounted on mobile racks that allow them to be moved to the point of use. The application primarily determines the configuration of the system; the codec itself will be fairly standard. The primary concerns are the quality, latency and bandwidth required by the codec. One should always use the codec that provides these features at an appropriate level for the application.
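The selection logic described above can be written down as a small filter. Everything in the sketch is hypothetical: the three candidate systems, their figures, the 1 to 5 quality scale and the rule of picking the cheapest suitable option are assumptions made for illustration, not recommendations or vendor data.

```python
from dataclasses import dataclass


@dataclass
class CodecOption:
    name: str
    standards: tuple
    min_bandwidth_kbps: int
    latency_ms: int
    quality: int   # 1 (poor) to 5 (excellent), a made-up rating scale
    cost: int      # arbitrary currency units


def choose(options, link_kbps, max_latency_ms, min_quality, required_standard):
    """Keep options that meet every constraint, then return the cheapest."""
    suitable = [
        o for o in options
        if o.min_bandwidth_kbps <= link_kbps
        and o.latency_ms <= max_latency_ms
        and o.quality >= min_quality
        and required_standard in o.standards
    ]
    return min(suitable, key=lambda o: o.cost, default=None)


# Hypothetical candidates; none of these figures come from a real product.
options = [
    CodecOption("Webcam and free software", ("H.323",), 256, 200, 2, 100),
    CodecOption("Desktop unit", ("H.323", "H.320"), 384, 120, 4, 4000),
    CodecOption("Room system", ("H.323", "H.320", "H.239"), 768, 100, 5, 12000),
]

pick = choose(options, link_kbps=512, max_latency_ms=150,
              min_quality=4, required_standard="H.323")
print(pick.name)  # Desktop unit
```

With a 512 kbit/s link the webcam option fails on quality and latency and the room system needs more bandwidth than is available, so the desktop unit is the only candidate left. If no option satisfies the constraints, `choose` returns `None`, which signals that either the requirements or the network must change.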

Conclusion

Ultimately, the choice of a codec for telehealth depends on the trade-offs between quality, cost, bandwidth requirements and interoperability required by the application and the environment in which the system will be used. Another important factor to be considered is technical support; codecs as used in telehealth are still not ‘plug and play’ devices and require expertise to set up and maintain. Many organizations and facilities investigating telehealth are doing so because they are located where there are limited resources of many kinds, including technical expertise. When having an on-site or on-call technician is not possible, the availability of other forms of support, such as a vendor helpline, can be important. The system's simplicity and reliability are particularly important in clinical telemedicine applications, since the users are typically not technical experts but clinicians with significant time constraints.

Another consideration is which systems are being used by others in a similar region. Telehealth providers are usually happy to share their experience and help new users to determine what is most appropriate for their applications. Finding another group that has developed telehealth applications similar to the ones being considered can be invaluable. In the end, depending on their applications and constraints, most organizations with a commitment to developing telehealth should be able to find satisfactory codecs that will operate with sufficient quality and reliability on the available network.

References

1. Tracy. A Guide to Getting Started in Telemedicine. See http://telehealth.muhealth.org/general%20information/getting.started.telemedicine.pdf (last checked 11 October 2007)

2. Microsoft Corporation. Using Codecs. See http://www.microsoft.com/windows/windowsmedia/player/faq/codec.mspx (last checked 11 October 2007)

3. International Telecommunication Union. Telecommunication Standardization Sector. See http://www.itu.int/ITU-T/index.html (last checked 11 October 2007)

4. International Telecommunication Union. Video codec for audiovisual services at p x 64 kbit/s. See http://www.h261.com/doc/h.261.pdf (last checked 26 October 2007)

5. Stillerman. A Look Inside H.239: an Introduction to Using Video and Data Together on a Single Call. See http://www.ihets.org/archive/progserv_arc/research_arc/pubs_arc/H_239_v3_2.pdf (last checked 11 October 2007)

6. Wainhouse Research. Emerging Technologies for Teleconferencing and Telepresence. Duxbury, MA: Wainhouse Research, 2005. See http://www.wrplatinum.com/content.aspx?CID=4382 (last checked 11 October 2007)

7. Siglin. Latency: the Next Frontier. See http://www.streamingmedia.com/article.asp?id=9657&page=1 (last checked 26 October 2007)

8. Davis, Weinstein. Telepresence 2007: Taking Videoconferencing to the Next Frontier. Duxbury, MA: Wainhouse Research, 2007. See http://www.wrplatinum.com/content.aspx?CID=6749 (last checked 11 October 2007)