Technology | DOI: 10.1145/1378727.1378733
Logan Kugler
Ubiquitous Video
Scalable and distributed video coding offers
the promise of two-way, real-time video.

Larissa, a Brazilian foreign-language student studying in Tokyo, gets a call on her cell phone just as she arrives at her apartment after classes. She peers at the phone’s display and sees her mother sitting in the living room of the family’s home in São Paulo, plus a blinking blue dot indicating the call is a live, two-way video stream. Larissa flips open her phone.

“Mama, do you like my new haircut?” Larissa asks as she lets herself into her apartment. “Is it too short?”

“No, it looks terrific,” says her mother. “I have some video of your father’s birthday party. Please turn on your TV.”

“Okay,” replies Larissa, who points her cell phone at the 50-inch, flat-panel television on her living room wall and pushes a button. The television flashes awake, picks up the video stream from the phone, and displays a high-quality video of her family celebrating her father’s 49th birthday at his favorite restaurant in São Paulo.

One phone call, one stream of information. The cell phone takes only the data it needs for its two-inch display while the 50-inch television monitor takes far more data for its greater resolution—all from the same video stream.

Welcome to the future world of scalable, distributed video.

Digital video coding compresses the original data into fewer bits while achieving a prescribed picture quality, which it accomplishes largely by eliminating redundancies. Image data for a static background object, for instance, is stored just once, with subsequent frames merely pointing back to the original and registering only incremental changes.
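As a rough illustration of the redundancy idea above, the sketch below stores the first frame whole and, for each later frame, records only the pixels that changed. The tiny one-dimensional "frames" and the function names are assumptions made for this example; real codecs work on two-dimensional blocks and use far more elaborate prediction.

```python
def encode(frames):
    """Keep the first frame in full, then only the pixels that differ."""
    key = list(frames[0])
    deltas = []
    prev = key
    for frame in frames[1:]:
        # Map pixel index -> new value, only where the value changed.
        changes = {i: v for i, (p, v) in enumerate(zip(prev, frame)) if p != v}
        deltas.append(changes)
        prev = frame
    return key, deltas


def decode(key, deltas):
    """Rebuild every frame by applying each set of changes in order."""
    frames = [list(key)]
    for changes in deltas:
        frame = list(frames[-1])
        for i, v in changes.items():
            frame[i] = v
        frames.append(frame)
    return frames


# Hypothetical 8-pixel frames: a static background (5) and one moving pixel (9).
frames = [
    [5, 5, 9, 5, 5, 5, 5, 5],
    [5, 5, 5, 9, 5, 5, 5, 5],
    [5, 5, 5, 5, 9, 5, 5, 5],
]

key, deltas = encode(frames)
raw = sum(len(f) for f in frames)                    # values stored without coding
stored = len(key) + sum(2 * len(d) for d in deltas)  # index + value per change
print("raw values:", raw, "stored values:", stored)
```

The background pixels are written once in the key frame; each later frame costs only two small entries for the pixel that moved.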

Today’s video coding paradigm exploits temporal and spatial redundancies (think of them as elements that repeat within a frame and from one frame to the next) with a series of predictions, a set of representations, and a slew of cosine calculations. The goals are to remove the details the human eye can’t see (whether they’re too fast, dark, or small), set aesthetic rules (such as color and aspect ratio), tailor the bit and frame rates for the highest picture quality at the lowest file size, and save as much bandwidth as possible.

A video stream is broken up into pictures that are not necessarily encoded in the order in which they are played back. Encoders append such commands as “for blocks 37–214, duplicate the same blocks in the last frame,” and quantize the transform coefficients to take advantage of the limitations of human visual perception. Finally, entropy coding removes the statistical redundancy of the resulting coded symbols.
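The transform, quantization, and entropy steps can be sketched on a single row of eight samples. This is a minimal illustration, not any standard's actual pipeline: the sample values and the quantizer step of 10 are made up, the transform is a plain one-dimensional DCT-II, and the entropy function only estimates the bits per symbol an ideal entropy coder would need.

```python
import math
from collections import Counter


def _scale(k, n):
    # Orthonormal scaling so the inverse transform is the exact mirror.
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(block):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(block)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                           for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct(): rebuild samples from coefficients."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def quantize(coeffs, step):
    """Map each coefficient to the nearest multiple of step (as an integer level)."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    return [level * step for level in levels]


def entropy_bits(symbols):
    """Shannon entropy in bits per symbol: a lower bound for an entropy coder."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


# Hypothetical row of pixel values and quantizer step.
block = [52, 55, 61, 66, 70, 61, 64, 73]
step = 10

levels = quantize(dct(block), step)
restored = idct(dequantize(levels, step))

print("quantized levels:", levels)
print("bits per symbol (estimate):", entropy_bits(levels))
print("largest pixel error:", max(abs(a - b) for a, b in zip(block, restored)))
```

A larger step discards more detail and tends to leave more repeated levels, which is what gives the entropy coder something to compress.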

It’s not quite instant, but in fairly short order video encoders produce a digital video file, a fraction of its original size, for an iPod, laptop, or cell phone. And with advances in scalable and distributed video coding, two-way, real-time video, such as Larissa’s conversation with her mother, is becoming a reality.

Robust encoding

Hybrid coding, which leverages both the temporal/predictive and frequency domains, is the basis for most current video standards. It does the hard work at the encoding step, resulting in complex encoders but just basic decoders.
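One way to see why the encoder carries most of the load is a toy block-matching search. In this sketch, which assumes one-dimensional frames and hypothetical function names, the encoder tries every offset in a search window to find the best match in the previous frame, while the decoder simply copies the block the encoder chose and adds the residual.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def encode_block(reference, current, start, size, search_range):
    """Encoder: exhaustive search for the best-matching block in the reference."""
    target = current[start:start + size]
    best_offset, best_cost = 0, None
    comparisons = 0
    for offset in range(-search_range, search_range + 1):
        pos = start + offset
        if pos < 0 or pos + size > len(reference):
            continue  # candidate block would fall outside the frame
        comparisons += 1
        cost = sad(reference[pos:pos + size], target)
        if best_cost is None or cost < best_cost:
            best_offset, best_cost = offset, cost
    pos = start + best_offset
    residual = [c - r for c, r in zip(target, reference[pos:pos + size])]
    return best_offset, residual, comparisons


def decode_block(reference, start, size, offset, residual):
    """Decoder: one copy from the reference plus the residual, no searching."""
    pos = start + offset
    return [r + d for r, d in zip(reference[pos:pos + size], residual)]


# Hypothetical frames: the same small object, shifted right by two samples.
reference = [0, 0, 7, 9, 7, 0, 0, 0, 0, 0]
current = [0, 0, 0, 0, 7, 9, 7, 0, 0, 0]

offset, residual, comparisons = encode_block(
    reference, current, start=4, size=3, search_range=3
)
rebuilt = decode_block(reference, start=4, size=3, offset=offset, residual=residual)
print("offset:", offset, "residual:", residual, "block comparisons:", comparisons)
print("decoded block:", rebuilt)
```

The search cost grows with the window size and the number of blocks, all of it paid by the encoder; the decoder's work per block stays constant.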

A downlink model, in which a few encoders serve many distributed decoders, works very well for TV and cable broadcasting and on-demand Web video, but it makes decoder complexity its focus. Today’s challenge, on the other hand, is the proliferation of wireless mobile devices, from cell phones and Internet tablets to laptops, that rely on uplinks to deliver data. This requires capable device-based encoders.

In addition to robust encoding, these emerging applications require improved compression and increased
