Tue Jul 7 09:23:02 CEST 2009

Video Codecs

Time to get an overview of the different techniques used in video
codecs, to see how they relate to resource use.  In general, MPEG-1 is
mostly for video on CD.  MPEG-2 is directed at TV (broadcast + DVD),
with more emphasis on robustness, delay and different modes, and has
an improved audio codec.  MPEG-4 adds improved video coding and a more
general-purpose decoder, but is a hodge-podge of optionally
implemented features.  Part 2 (DivX) and Part 10 (AVC) are important.

MPEG-1 video [1]: 

  - Group-Of-Pictures with Intra-frame (I, keyframe) encoded roughly
    like JPEG, Predicted-frame (P) coded as the difference to the
    previous reference frame, incorporating motion vectors on
    macroblocks (16x16 = 4 luma 8x8 + 2 chroma 8x8, one Cb and one
    Cr), and Bidirectional-frame (B) using a forward and a backward
    frame as reference.  D-frames (DC coefficients only) serve as
    "thumbnails" for fast-forward.

  - Motion estimation searches a fixed region (e.g. a diamond) around
    each macroblock.  MPEG-1 motion vectors have half-pixel precision
    (quarter-pixel arrives with MPEG-4).  MVs are differentially
    encoded from neighbouring macroblocks.

  - The DC part of the DCT coefficients is encoded differentially.
    The AC coefficients are scanned in a zig-zag pattern (most energy
    is in the upper left corner around DC) and then RLE encoded.  The
    quantizer scale uses 5 bits (1-31).  Thresholding is adaptive (or user

  - The resulting symbols (run/level pairs, motion vectors, macroblock
    types) are Huffman encoded with fixed code tables.
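
The DC/AC coding steps in the list above can be sketched roughly as
follows.  This is a minimal illustration, not the MPEG-1 bitstream
syntax: the function names are made up for this note, the DC predictor
starts at 0 instead of the reset value the standard uses, blocks are
assumed to be already quantized 8x8 integer arrays, and the final
Huffman stage is left out.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    # Cells on one anti-diagonal share row + col.  Odd diagonals are walked
    # top to bottom (row ascending), even ones bottom to top (col ascending).
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_level_pairs(ac):
    """Run-length code AC coefficients as (zero_run, nonzero_level) pairs."""
    pairs, run = [], 0
    for level in ac:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    # Trailing zeros are dropped; a real coder signals them with an
    # end-of-block code.
    return pairs


def encode_blocks(blocks):
    """Return one (dc_difference, run_level_pairs) tuple per 8x8 block."""
    order = zigzag_order(8)
    prev_dc = 0  # assumption: simplified predictor start value
    out = []
    for block in blocks:
        scanned = [block[r][c] for r, c in order]
        dc, ac = scanned[0], scanned[1:]
        out.append((dc - prev_dc, run_level_pairs(ac)))
        prev_dc = dc
    return out
```

A block whose only nonzero values are DC = 12, a 5 at (0,1) and a -3 at
(2,0) comes out as DC difference 12 plus the pairs (0, 5) and (1, -3):
the long tail of zeros costs nothing, which is the point of scanning
from the high-energy corner outwards.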


MPEG-2:

  - Systems section: Transport Stream (lossy media like broadcast) and
    Program Stream (reliable media like DVD).

  - Video similar to MPEG-1, optimized for higher bitrates and a wider
    range of formats (e.g. interlaced).  Audio part contains AAC,
    which is more efficient, flexible and robust.  (Part 2 = H.262)
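
Part of what makes the Transport Stream suitable for lossy media is
its framing: fixed 188-byte packets, each starting with the sync byte
0x47 and carrying a 13-bit PID and a 4-bit continuity counter, so a
receiver can resynchronize and notice dropped packets.  A minimal
sketch of reading that 4-byte header; the function name and the
returned dict keys are made up for this note, and the adaptation field
and payload are ignored.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def parse_ts_header(packet: bytes) -> dict:
    """Decode the fixed 4-byte header of one Transport Stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet is exactly 188 bytes")
    if packet[0] != TS_SYNC_BYTE:
        raise ValueError("lost sync: first byte is not 0x47")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error": bool(b1 & 0x80),     # set by a demodulator on bad data
        "payload_unit_start": bool(b1 & 0x40),  # a new PES packet/section starts here
        "pid": ((b1 & 0x1F) << 8) | b2,         # 13-bit stream identifier
        "continuity_counter": b3 & 0x0F,        # 4-bit per-PID counter
    }
```

A gap in the continuity counter of a given PID indicates that packets
were lost, and scanning for 0x47 at 188-byte intervals is enough to
find packet boundaries again after an error.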


MPEG-4:

  - More advanced video coding + object-oriented design.  Decoder
    behaves more like a rendering engine.  Variable block size motion

  - Part 2: MPEG-4 Visual, the format behind DivX.

  - Part 10: H.264 / AVC (HD DVD, Blu-ray).  Many additions [9].
    Highly nontrivial decoding/encoding.

Other formats:

  - Theora (in Ogg container).  Open but less well performing codec.

  - H.261 Low bit rate (ISDN) video conferencing.

  - H.263 Low bit rate video conferencing.  A variant (Sorenson H.263)
    is used in Apple QuickTime and Adobe Flash Video.  Original base
    for RealVideo.  Part of 3GPP (MMS).

[1] http://en.wikipedia.org/wiki/MPEG-1#Part_2:_Video
[2] http://en.wikipedia.org/wiki/Advanced_Audio_Coding
[3] http://en.wikipedia.org/wiki/MPEG
[4] http://en.wikipedia.org/wiki/Flv
[5] http://en.wikipedia.org/wiki/Theora
[6] http://en.wikipedia.org/wiki/H.261
[7] http://en.wikipedia.org/wiki/H.263
[8] http://en.wikipedia.org/wiki/Video_codec
[9] http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC