Sharovatov’s Weblog

Silverlight smooth streaming and HTTP

Posted in http by sharovatov on 28 April 2009

I’ve read about smooth streaming technology, and I must say, I just love the way it works. It automatically and smoothly adjusts video quality, so clients can watch smooth video online regardless of their bandwidth, without having to sit and stare at a “buffering” message for ages. What it does is dynamically change the video quality based on network bandwidth, CPU load and other factors.

Its idea and implementation are so simple and beautiful that I wonder why nobody invented it earlier. These are the steps you have to follow to make it work:

  1. encode your video file in Expression Encoder 2 Service Pack 1
  2. upload it to an IIS7 web server with the IIS7 Smooth Streaming extension installed
  3. point your Silverlight player to the appropriate URL

That’s it. Here’s a showcase of how this technology works – awesome!

Looks simple, right? It is simple, but there’s a huge amount of work hidden behind this simplicity. Let me dive into the technical details a little bit :)

First of all, let me give you some background. Originally there were basically two types of streaming – stateful streaming and progressive download.

A good example of stateful streaming is RTSP. The client had to initiate a connection to the server and send commands like PAUSE or PLAY; the server sent back video stream packets, and the client waited for its playback buffer to fill before starting playback. RTSP worked over both UDP and TCP (port 554 was used).

Progressive download is where the client sent a traditional HTTP GET request, the server responded with the video data (sent, for example, with HTTP chunked encoding), and the client started playback as soon as its playback buffer had enough data to play.

Both approaches had serious issues. RTSP couldn’t work for clients behind proxies or firewalls without extra effort (that’s why Apple had to spend time reinventing the wheel by tunnelling RTSP over HTTP), and progressive download didn’t work well when bandwidth wasn’t good enough – have you ever had a wonderful time sitting and staring at a “Buffering” message?

So if you want to give the highest video quality to users on high bandwidth but still want to show users on low bandwidth at least something, you create several versions of the same video and give users a way to choose which one to watch.

But what if a user doesn’t know what bandwidth he’s got? What if the player itself could automatically select which video to download – high-res or low-res? What if the player could change bitrate during playback when network conditions or CPU load changed? What if the player could instantly start playback from any point of the movie? And what if pure HTTP was used, so that there would be no issues with proxies? What if each chunk of video could be perfectly cached by an HTTP agent, such as a proxy?

That’s precisely how Microsoft Silverlight Smooth Streaming works.

First of all, Microsoft decided to switch from their ASF format to MP4. There were many reasons for that, but the main point is that the MP4 container specification allows content to be internally organised as a series of fragments, built from so-called “boxes”. Each fragment carries both data and metadata, and because the metadata is written before the data, the player has the information it needs about a piece of video before it plays it.

So what does Expression Encoder do? It allows you to easily create multiple versions of the same video at different bitrates in this fragmented MP4 format. So you get up to 10 versions of the same video file at different resolutions – from 320×200 up to 720p or 1080p. Internally each file is split into 2-second chunks, and each chunk has its own metadata, so you can programmatically locate the required chunk. Plus Expression Encoder creates two complementary XML manifest files: *.ISM, the server manifest (a SMIL-based file), which basically just tells the server which file versions have which bitrates; and *.ISMC, the client manifest, which tells a client which bitrates are available and how many fragments the files have.

Can you see the idea? The IIS Smooth Streaming extension just maps a URL to a chunk in a file. You make an HTTP GET request to a URL like this:

http://test.ru/mov.ism/QualityLevels(400000)/Fragments(video=61024)

And the IIS Smooth Streaming extension checks the “mov.ism” manifest file to find the name of the file with the requested quality level (400000), then opens and parses that file to get the chunk with the requested time offset (61024). This chunk is then returned to you in a normal HTTP response.

So you can request any chunk of any of your video files by its time offset.
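To make the URL scheme concrete, here is a minimal sketch that builds and takes apart a URL of the shape shown above. The function names and the regular expression are mine, written purely for illustration; this is not how the IIS extension itself is implemented.

```python
import re

# Pattern for the URL shape shown in the post:
#   <base>/<name>.ism/QualityLevels(<bitrate>)/Fragments(<stream>=<offset>)
FRAGMENT_URL = re.compile(
    r"^(?P<manifest>.+\.ism)/QualityLevels\((?P<bitrate>\d+)\)"
    r"/Fragments\((?P<stream>\w+)=(?P<offset>\d+)\)$"
)


def build_fragment_url(manifest_url, bitrate, offset, stream="video"):
    """Compose the URL for one chunk at the given quality level and time offset."""
    return f"{manifest_url}/QualityLevels({bitrate})/Fragments({stream}={offset})"


def parse_fragment_url(url):
    """Split a chunk URL into the pieces a server would need to locate the chunk."""
    match = FRAGMENT_URL.match(url)
    if match is None:
        raise ValueError(f"not a fragment URL: {url}")
    return {
        "manifest": match.group("manifest"),
        "bitrate": int(match.group("bitrate")),
        "stream": match.group("stream"),
        "offset": int(match.group("offset")),
    }


url = build_fragment_url("http://test.ru/mov.ism", 400000, 61024)
print(url)
print(parse_fragment_url(url))
```

The parsed bitrate is what would be looked up in the server manifest, and the offset is what would be looked up inside the selected file.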

Let me repeat it – you encoded your original video file into 10 fragmented video files with different bitrates. And you have a way to request any chunk in any of these files.

So to play 10 seconds of video you have to make 5 consecutive HTTP requests. As we have versions of the same video at different bitrates, we can get the first chunk in the lowest quality to see how it renders and how long it takes to download, and then, if CPU load is low and the network is fast, we can request the next 4 chunks at a higher bitrate.
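The arithmetic can be sketched in a few lines: 10 seconds of video in 2-second chunks means 5 URLs, the first at the lowest bitrate and the rest at a higher one. Note the assumptions: the post doesn't say what unit the time offset uses, so `units_per_second` is a made-up parameter, and the 1500000 bitrate is just an example value.

```python
CHUNK_SECONDS = 2  # chunk length stated in the post


def plan_requests(base, seconds, low, high, units_per_second=1000):
    """Return one chunk URL per 2-second chunk needed to cover `seconds` of video.

    units_per_second is an assumption made for this sketch; the real unit of
    the time offset is not given in the post.
    """
    count = -(-seconds // CHUNK_SECONDS)  # ceiling division: 10 s -> 5 chunks
    urls = []
    for i in range(count):
        # Probe with the lowest quality first, then move to the higher bitrate.
        bitrate = low if i == 0 else high
        offset = i * CHUNK_SECONDS * units_per_second
        urls.append(f"{base}/QualityLevels({bitrate})/Fragments(video={offset})")
    return urls


urls = plan_requests("http://test.ru/mov.ism", 10, 400000, 1500000)
for u in urls:
    print(u)
```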

And that’s exactly what the Silverlight Media Player component does – it requests chunk after chunk from the server and changes the “QualityLevels” parameter in the URL if conditions change. For example, if Silverlight Media Player sees that CPU load is too high and it’s dropping frames, or that the network has become too slow, it changes the “QualityLevels” parameter to a lower value, so that the IIS Smooth Streaming extension serves the next chunks from the smaller file with lower video quality.
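I don't know the exact heuristics the Silverlight player uses, so the following is only a rough sketch of the kind of decision described above. The function name, the 0.8 safety margin and the "step up one level at a time" rule are all my assumptions.

```python
def next_bitrate(levels, current, throughput_bps, dropping_frames, safety=0.8):
    """Pick the QualityLevels value for the next chunk.

    levels          -- available bitrates (bits per second); current must be one of them
    current         -- bitrate of the chunk that was just played
    throughput_bps  -- measured download speed of recent chunks
    dropping_frames -- True if the CPU cannot keep up with decoding
    safety          -- assumed margin: use only this share of the measured speed
    """
    levels = sorted(levels)
    idx = levels.index(current)

    # Highest bitrate the network can sustain, or the lowest one if none fits.
    affordable = [b for b in levels if b <= throughput_bps * safety]
    network_choice = affordable[-1] if affordable else levels[0]

    if dropping_frames:
        # CPU is overloaded: go at least one level down, further if the network demands it.
        cpu_cap = levels[max(idx - 1, 0)]
        return min(network_choice, cpu_cap)

    if network_choice > current:
        # Conditions are good: step up gradually, one level per chunk.
        return levels[idx + 1]

    # Network got slower (or nothing changed): drop straight to what it can carry.
    return network_choice


levels = [400000, 800000, 1500000, 3000000]
print(next_bitrate(levels, 400000, 5_000_000, dropping_frames=False))
print(next_bitrate(levels, 1500000, 1_200_000, dropping_frames=False))
print(next_bitrate(levels, 3000000, 10_000_000, dropping_frames=True))
```

Whatever value comes back is simply placed into the “QualityLevels” part of the next chunk URL; nothing else about the request changes.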

Actually, when the user starts playback, the first thing Silverlight Media Player does is request the ISMC file to find out how many different bitrate versions the server has (and how to identify fragments). Only then does it compose the URL to get the first chunk of video. Simple and beautiful technology.
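Here is a sketch of that start-up step: read the client manifest, collect the available bitrates and chunk offsets, and compose the URL of the first chunk. The XML below is a hand-written, heavily simplified stand-in for an ISMC file; the element and attribute names and the duration values are my assumptions for illustration, so check a real manifest produced by Expression Encoder before relying on them.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written stand-in for a client manifest (assumed layout).
SAMPLE_ISMC = """<SmoothStreamingMedia>
  <StreamIndex Type="video" Chunks="3"
               Url="QualityLevels({bitrate})/Fragments(video={start time})">
    <QualityLevel Bitrate="1500000"/>
    <QualityLevel Bitrate="400000"/>
    <c d="20000"/>
    <c d="20000"/>
    <c d="20000"/>
  </StreamIndex>
</SmoothStreamingMedia>"""


def read_client_manifest(xml_text):
    """Return (url_template, sorted_bitrates, chunk_offsets) for the video stream."""
    root = ET.fromstring(xml_text)
    stream = root.find("StreamIndex[@Type='video']")
    bitrates = sorted(int(q.get("Bitrate")) for q in stream.findall("QualityLevel"))
    # Each <c> carries a duration; a chunk's offset is the sum of earlier durations.
    offsets, position = [], 0
    for chunk in stream.findall("c"):
        offsets.append(position)
        position += int(chunk.get("d"))
    return stream.get("Url"), bitrates, offsets


def first_chunk_url(manifest_url, xml_text):
    """Compose the URL of the first chunk at the lowest available bitrate."""
    template, bitrates, offsets = read_client_manifest(xml_text)
    path = template.replace("{bitrate}", str(bitrates[0]))
    path = path.replace("{start time}", str(offsets[0]))
    return f"{manifest_url}/{path}"


print(read_client_manifest(SAMPLE_ISMC))
print(first_chunk_url("http://test.ru/mov.ism", SAMPLE_ISMC))
```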

So what do we have? Video plays smoothly – in lower quality on old slow internet connections and in full HD on fast connections with good CPUs. As HTTP is used as the transport, there are no issues with proxies or firewalls; and as each chunk is identified by a unique URL, every single chunk can be perfectly cached by proxies or other HTTP agents.
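The caching point is easy to see in a toy model: since a chunk is fully identified by its URL, a cache keyed by URL is all an intermediary needs. The `ChunkCache` class and `fake_origin` function below are hypothetical and only illustrate the idea; a real proxy also honours HTTP cache headers, expiry and so on.

```python
class ChunkCache:
    """Toy cache keyed by chunk URL, standing in for a caching proxy."""

    def __init__(self, fetch):
        self._fetch = fetch  # function that gets a chunk from the origin server
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


origin_requests = []


def fake_origin(url):
    """Pretend origin server: records the request and returns dummy bytes."""
    origin_requests.append(url)
    return f"<bytes of {url}>".encode()


cache = ChunkCache(fake_origin)
low = "http://test.ru/mov.ism/QualityLevels(400000)/Fragments(video=61024)"
high = "http://test.ru/mov.ism/QualityLevels(1500000)/Fragments(video=61024)"

cache.get(low)   # first viewer: fetched from the origin
cache.get(low)   # second viewer, same chunk and quality: served from the cache
cache.get(high)  # same offset, different quality: a different URL, so a new fetch
print(cache.hits, cache.misses, len(origin_requests))
```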

And as this technology is quite simple, there’s no doubt that there will be a similar module for other web servers, or even web applications achieving similar functionality!

Yes, as it’s encoded in multiple bitrate versions, it takes up to 6 times more space for one movie/clip, but if that’s what it takes to provide users with smooth playback in any conditions – I’m for it!

Thanks for another great technology, IIS Team!
