5. Compressing the video signals:
DBS video compression uses an algorithm, essentially a mathematical recipe, called MPEG2, with some newer services using a variation called MPEG4, which provides better quality at lower bit rates. These and most other forms of audio and video coding use lossy compression: some of the sound and picture information is discarded as part of the compression process. When the signals are decompressed, a subjectively pleasing signal is recovered, but it lacks the full gamut of the original, much having been discarded during encoding. Lossless compression does exist, for example in data compression algorithms like zip, arj, zoo and lzh, but its compression range and the data types it suits are very limited.
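To make the lossless case concrete, here is a minimal sketch using Python's standard zlib module, which implements the DEFLATE method also used inside zip files. The sample data is made up for illustration; the point is only that the restored bytes match the original exactly, which a lossy video codec does not promise.

```python
import zlib

# A highly repetitive byte string, standing in for any data file (made-up sample).
original = b"sync blank sync blank " * 200

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes in,", len(compressed), "bytes stored")
print("bit-for-bit identical:", restored == original)
```

Repetitive data like this shrinks dramatically, but typical noisy picture data does not, which is why video coding falls back on the lossy methods described in this section.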
The following is a quick once-over of the highlights of the MPEG2 video compression process. It's a bit complicated, so take a deep breath, and read on!
Video compression involves the removal of redundancies from the video signal, and there are many redundancies. Some predictable things repeat up to thousands of times per second: the synchronization and blanking signals (following each scan line, and between each video frame) need not be transmitted at all, because they can be defined in a lookup table in the receiver. This process is similar to automatically pasting stored text in a word processor program. This alone reduces the bit rate needed by about 20%.
Video can be digitally low-pass filtered by changing the number of times it is sampled during the active portion of each scanning line (the part containing the changing picture information). The more samples per line, the sharper the image appears. Low-pass filtering also removes some of the thermal noise present on the video signal, which would otherwise also need to be encoded. Generally, the premium channels are allocated more samples per line than the lower-cost programming.
Human vision is quite limited under some circumstances. For example, as the angular motion of an object increases, the eye's ability to see detail falls off. Video compression takes advantage of this by providing low-pass filtering (sometimes called subsampling), which may increase automatically as the angular motion increases. If the detail is beyond the limits of visual acuity, why waste the extra bits? And at the receive location, the image can be adequately approximated using motion prediction (see below).
Also, on program material like news or quiz shows, the true video frame rate can be dropped from about 30 frames per second (fps) to some lower value. The missing information can again be approximated using the motion prediction mentioned below.
Color information (as distinguished from the luminance or monochrome information) is subsampled at half the number of samples used for the luminance channel. Once again, this is a human visual acuity aspect that is exploited by modern video compression.
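The halving of color samples can be sketched as follows. This is an illustrative example only: the sample values, the function names and the simple averaging and repeating filters are assumptions, not the filters a real MPEG2 encoder or receiver uses.

```python
def subsample_chroma_line(chroma, factor=2):
    """Keep one color sample per `factor` positions by averaging neighbours."""
    out = []
    for i in range(0, len(chroma), factor):
        group = chroma[i:i + factor]
        out.append(sum(group) / len(group))
    return out


def upsample_chroma_line(chroma, factor=2):
    """Receiver side: repeat each color sample to restore the line length."""
    return [c for c in chroma for _ in range(factor)]


# One short scan line (hypothetical values): luminance keeps every sample,
# the color-difference channel keeps half as many.
luma = [16, 40, 80, 120, 160, 200, 220, 235]
cb = [100, 102, 110, 112, 130, 128, 90, 94]

cb_half = subsample_chroma_line(cb)
print(len(luma), "luminance samples,", len(cb_half), "color samples")
print(upsample_chroma_line(cb_half))
```

The restored color line is only an approximation of the original, but because the eye resolves color detail poorly the difference is hard to see.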
Perhaps the most powerful compression is then done by first forming the video data stream into 8 x 8 picture element (pixel) blocks. A discrete two-dimensional cosine transform is performed on each block's data, which converts the block from the spatial domain into the frequency (or transform) domain. The resulting coefficients are then scanned diagonally, in a serpentine manner, so that the less significant coefficients fall at the end of the scan, where they are truncated (discarded). This process can also reduce noise in the image by a small amount, since the discontinuous data in each block tends to be included in the truncated data.
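A minimal sketch of the block transform follows. It applies the cosine transform to an 8 x 8 block first and then reads the coefficients out in diagonal serpentine order, which is the order MPEG2 uses. The function names, the test block and the choice to keep ten coefficients are assumptions for illustration, and the direct four-loop transform is far slower than what real encoders use.

```python
import math

N = 8  # block size in pixels


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimised form)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def zigzag_order(n=N):
    """Return (row, col) pairs in diagonal serpentine order, low frequencies first."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Odd diagonals run top to bottom, even diagonals run bottom to top.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def truncate(coeffs, keep):
    """Keep the first `keep` coefficients of the scan and zero the rest."""
    out = [[0.0] * N for _ in range(N)]
    for r, c in zigzag_order()[:keep]:
        out[r][c] = coeffs[r][c]
    return out


# A smooth left-to-right brightness ramp (hypothetical picture data).
block = [[16 * col for col in range(N)] for _ in range(N)]
coeffs = dct_2d(block)
kept = truncate(coeffs, keep=10)
significant = sum(1 for row in coeffs for value in row if abs(value) > 1.0)
print("coefficients larger than 1.0:", significant, "of", N * N)
```

For smooth picture content like this ramp, only a handful of the 64 coefficients carry meaningful energy, which is why discarding the tail of the scan costs so little visible quality.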
Another key part of compression is quantization, the reduction of the number of shades of gray available for each pixel, from 256 shades to some lower value. This is achieved by combining two or three (or more) adjacent gray scale values into a new single value. (In MPEG2 the same coarsening is applied to the transform coefficients of each block rather than directly to the pixels.) When overdone, however, this can significantly degrade an image's subjective quality.
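The idea can be sketched on plain gray values, as described above. The step size of 4 (merging four adjacent shades, so 256 shades become 64) and the function names are assumptions chosen for illustration.

```python
def quantize(value, step):
    """Map a 0-255 gray value to the index of the bin it falls in."""
    return value // step


def dequantize(index, step):
    """Receiver side: reconstruct the mid-point of the bin."""
    return min(255, index * step + step // 2)


step = 4  # merge 4 adjacent gray values: 256 shades become 64
pixels = [0, 1, 2, 3, 4, 127, 128, 254, 255]

indices = [quantize(p, step) for p in pixels]
restored = [dequantize(i, step) for i in indices]
worst_error = max(abs(p - r) for p, r in zip(pixels, restored))

print("indices: ", indices)
print("restored:", restored)
print("worst error in gray levels:", worst_error)
```

A larger step saves more bits per value but raises the worst-case error, which is the trade-off behind the visible degradation when quantization is pushed too far.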
Another important part of video compression is called motion prediction, motion compensation or loop filtering, depending upon the context or the particular standards document. The transmit codec's buffer, or temporary bit storage, stores two frames of video, and the codec compares the video blocks between the two frames. Once analyzed, the blocks or groups of blocks can be assigned a vector amplitude, direction and rate of change. This information is then passed along, allowing the receive decoder to predict motion for various parts of the television image. The prediction information is much more compact than the actual new-image data. Sometimes, if the codec is incorrectly set up, in a low-bitrate environment, and with high-contrast lighting, this can produce "interesting" received artifacts, like a nose periodically detaching from a person's face.
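One common way to compare blocks between two frames is an exhaustive block-matching search, sketched below. The source does not say which search a given codec uses, so treat the sum-of-absolute-differences cost, the tiny 4 x 4 block, the search range and all function names as assumptions for illustration.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(prev, curr, x, y, size=4, search=2):
    """Find the offset (dx, dy) into `prev` that best predicts the block of `curr` at (x, y)."""
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(curr, prev, x, y, x, y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(curr, prev, x, y, px, py, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best, best_cost


def make_frame(width, height, sq_x, sq_y, size=4):
    """Black frame with one bright square (hypothetical test picture)."""
    frame = [[0] * width for _ in range(height)]
    for yy in range(sq_y, sq_y + size):
        for xx in range(sq_x, sq_x + size):
            frame[yy][xx] = 200
    return frame


prev_frame = make_frame(12, 12, 3, 3)
curr_frame = make_frame(12, 12, 5, 4)  # square moved 2 right, 1 down

vector, cost = find_motion_vector(prev_frame, curr_frame, 5, 4)
print("motion vector:", vector, "residual cost:", cost)
```

Here the encoder would send only the two-number vector instead of the sixteen pixel values of the block, and the decoder would copy the block from the previous frame at that offset.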
The final coding technique used is called Huffman coding, where the statistically most common digital patterns in the data stream are replaced with terse codes from a lookup table. This is a bit of lossless coding that is useful for video compression.
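A small sketch of Huffman coding shows how the lookup table is built from symbol statistics. It is a textbook construction, not the fixed code tables defined in the MPEG2 standard, and the sample data and function names are assumptions.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Build a prefix-code lookup table {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code built so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, table):
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    reverse = {code: s for s, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out


# Hypothetical data in which the value 0 is by far the most common.
data = [0] * 12 + [1] * 5 + [2] * 2 + [3]
table = huffman_table(data)
bits = encode(data, table)

print("code lengths:", {s: len(c) for s, c in sorted(table.items())})
print(len(bits), "bits, versus", 2 * len(data), "bits at a fixed 2 bits per symbol")
```

Decoding the bit string returns the original values exactly, which is what makes this stage lossless.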
There are lots of other techniques in the video compression cookbook with names like transform mode, conditional replenishment, intra-frame coding, various forms of differential coding, field repeat, frame repeat, buffer overflow and so forth. For this paper though, the above should whet your interest. But do remember that the degree and types of processing actually carried out are based on the actual bandwidth available, the rate of change within the image (motion), and the amount of detail moving (plus other less critical items). The available transmission bandwidth is controlled by a statistical multiplexer, discussed in the next paragraph.
The various television signals making up each transponder's 23 Mbps capacity are managed by a device known as a statistical multiplexer or statmux. Each statmux accepts digital signals from a group of MPEG2 video encoders. For DBS, these transmit-only codecs take the uncompressed analog audio and video signals and compress them to typically between 1.5 Mbps and 15 Mbps. The statmux program monitors all the codecs and sends control information to each on the instantaneous bit rate available to it.
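The sharing of a transponder's capacity can be sketched as a simple proportional split. Real statmux algorithms are not described in this paper, so everything here is an assumption: the per-channel "complexity" score, the function name and the rule of giving every channel the 1.5 Mbps floor before dividing the rest in proportion to demand, capped at 15 Mbps.

```python
def allocate_bit_rates(complexity, total=23.0, floor=1.5, ceiling=15.0):
    """Split `total` Mbps across encoders in proportion to a hypothetical complexity score."""
    n = len(complexity)
    if n * floor > total:
        raise ValueError("too many channels for the transponder capacity")
    rates = [floor] * n
    remaining = total - n * floor
    active = [i for i in range(n) if complexity[i] > 0]
    while remaining > 1e-9 and active:
        weight = sum(complexity[i] for i in active)
        share = {i: remaining * complexity[i] / weight for i in active}
        remaining = 0.0
        still_active = []
        for i in active:
            room = ceiling - rates[i]
            if share[i] >= room:
                # Channel hits the ceiling; hand its surplus back to the pool.
                rates[i] = ceiling
                remaining += share[i] - room
            else:
                rates[i] += share[i]
                still_active.append(i)
        active = still_active
    return rates


# Hypothetical scores: a fast sports feed, a news desk, and some quieter channels.
scores = [8, 2, 1, 1, 3, 1]
rates = allocate_bit_rates(scores)
for score, rate in zip(scores, rates):
    print(f"complexity {score}: {rate:.3f} Mbps")
print("total:", round(sum(rates), 3), "Mbps")
```

In a live system the scores would change from moment to moment, and the split would be recomputed each time so that a busy picture borrows bits from channels that currently need fewer.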
The better encoders are fully programmable, and can have their algorithm adjusted dynamically, as conditions change.
The set top box or Integrated Receiver-Decoder (IRD) essentially undoes the encoding process in reverse order. However, it cannot do better than approximate data which was discarded during the encoding process.
Disclaimer and Copyright
© 2006 - 2008 Acleris