Sharovatov’s Weblog

HTTP Chunked Encoding

Posted in browsers, http by sharovatov on 30 April 2009

Have you noticed that some pages on the internet start rendering in your browser incrementally, block after block, while on other sites you have to sit and look at a white screen and then get the full page all at once?

There’re two main problems that can make your browser wait for the whole page to load before it starts parsing it:

  1. if you’re using IE6 and the page you’re viewing has table-based layout without COLGROUPs/COLs specifying width or table-layout: fixed CSS rule.
  2. if the page’s not being served from server using chunked encoding.

The first issue is really simple – IE6 has to know the exact width of the columns before it starts displaying a table, so if you don’t have a table-layout: fixed rule for the table or COLs with specified widths, it will wait for the whole content to load, calculate the widths and only then display the table. Other browsers (such as Opera, Firefox, Google Chrome) and newer versions of IE don’t have this issue and start displaying content right after they get at least a piece of it.
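For example, giving IE6 the column widths up front – an illustrative snippet, not taken from any real site – lets it render the rows as they arrive:

<table style="table-layout: fixed; width: 600px">
  <!-- column widths are known before any row content, so IE6 can draw rows as they arrive -->
  <col style="width: 200px">
  <col style="width: 400px">
  <tr><td>menu</td><td>content that takes a while to load</td></tr>
</table>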

So while the first issue is really simple, the second is definitely more interesting!

Normally, when an HTTP client receives a response, it parses the HTTP headers and then tries to read from the input exactly as many bytes as specified in the Content-Length header. So if it takes 3 seconds for the server-side script to prepare the page, the HTTP client (and the user!) will just sit waiting on an open connection for those 3 seconds.

What was OK back when HTTP/1.0 was invented and the web consisted almost entirely of static content, the authors of HTTP/1.1 considered unacceptable for the era of web applications.

And so the HTTP/1.1 spec introduced the concept of “chunked” transfer encoding:

The chunked encoding modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer containing entity-header fields. This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message

The main goal of HTTP Chunked Encoding is to allow clients to parse and display data immediately after the first chunk is read!

Here’s a sample of HTTP response with chunked encoding:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: text/html

c
<h1>go!</h1>

1b
<h1>first chunk loaded</h1>

2a
<h1>second chunk loaded and displayed</h1>

29
<h1>third chunk loaded and displayed</h1>

0
 

As you may see, there’s no Content-Length field in the beginning of the message, but there’s a hexadecimal chunk-size before every chunk. And 0 with two CRLFs specifies the end of the payload.

So the server doesn’t need to calculate Content-Length before it starts serving data to client. This is an amazing functionality! It means that the server can start sending the first part of the response while still processing the other parts of it.

Say you have a dynamic page with two blocks, both of which are queried from the database. You can either wait for both queries to finish, populate your template with the results and send the whole page to the client, or you can get the first query result, send it to the client in one chunk, then run the next query and send its results in another chunk. In most cases you won’t notice the difference between chunked and normal serving – but if the page is built from different sources or it takes significant time to prepare the data, the user experience can be seriously improved.
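Here’s a minimal sketch of the idea – illustrated with Node.js purely because it’s compact; any server-side stack that can flush output before the response is complete will do, and the “queries” below are simulated with timers:

// No Content-Length is set, so the response goes out with Transfer-Encoding: chunked
// and every write() can reach the browser immediately.
var http = require('http');

// pretend these are the two slow database queries from the example above
function firstQuery(callback)  { setTimeout(function () { callback('<p>first query result</p>'); }, 1000); }
function secondQuery(callback) { setTimeout(function () { callback('<p>second query result</p>'); }, 2000); }

http.createServer(function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.write('<h1>go!</h1>');              // first chunk – the user sees something right away
  firstQuery(function (html1) {
    res.write(html1);                     // second chunk, sent as soon as it's ready
    secondQuery(function (html2) {
      res.write(html2);                   // third chunk
      res.end();                          // zero-sized chunk terminates the response
    });
  });
}).listen(8080);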

Before AJAX (another Microsoft-invented technology) became widely popular, chunked encoding was the core of the so-called “Server Push” approach used for building web chats and for other streaming purposes. The idea was simple – the server didn’t close the HTTP connection and kept sending chunk after chunk with new messages or other data. This approach had serious drawbacks – e.g. the server had to keep a separate connection open for each client (which eats resources), browsers had a limit on waiting time so the page had to be reloaded once in a while, and so on. But chunked encoding was widely used anyway.

In my company we use chunked encoding to show a loading progress bar in our online train ticket ordering system – we serve the first chunk with a nicely styled <div id="loading"></div>, and when the data for the main table is ready, we serve it in the second chunk. And after the document has fully loaded, a JavaScript routine hides <div id="loading"></div> :) Simple and nice.
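Roughly – an illustrative reconstruction rather than the real production markup – the two chunks look like this:

<!-- chunk 1: sent immediately, so the user sees a styled progress indicator -->
<div id="loading">Searching for tickets, please wait...</div>

<!-- chunk 2: sent when the data is ready; the script hides the indicator -->
<table>
  <tr><td>train 042</td><td>08:30</td><td>available</td></tr>
</table>
<script type="text/javascript">
  window.onload = function () {
    document.getElementById('loading').style.display = 'none';
  };
</script>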


Silverlight Smooth Streaming and Akamai

Posted in http by sharovatov on 29 April 2009

Just noticed that smoothhd.com, which I mentioned in my previous post, serves media from Akamai CDN servers! I also found out that Akamai has a contract with Microsoft to deliver Smooth Streaming content – and that’s just great. It means that if you have webcasts or any other video you want to deliver to the widest possible audience, whatever their bandwidth and CPUs, Akamai plus Silverlight Smooth Streaming would be an ideal solution – you won’t even need to host the video files on your own servers! Or you can start by streaming from your own server and later, if required, seamlessly switch to Akamai.

And here’re some nice videos from MIX09 (about silverlight) that I’ve found today:

As soon as I get my IIS7, I’ll definitely try streaming something :)



Silverlight smooth streaming and HTTP

Posted in http by sharovatov on 28 April 2009

I’ve read about smooth streaming technology, and I must say, I just love the way it works. It automatically and smoothly adjusts video quality and allows clients to view smooth video online regardless of their bandwidth and without a need to sit and wait staring at “buffering” message for ages – what it does it dynamically changes the video quality based on network bandwidth, CPU load and other factors.

It’s idea and implementation are so simple and beautiful that I wonder why nobody didn’t invent it earlier. This is what steps you have to follow to make it work:

  1. encode your video file with Expression Encoder 2 Service Pack 1
  2. upload it to an IIS7 web server with the IIS7 Smooth Streaming extension
  3. point your Silverlight player at the appropriate URL

That’s it. Here’s a showcase how this technology works – awesome!

Looks simple, right? It is simple, but there’s a huge amount of work hidden behind this simplicity. Let me dive into the technical details a little bit :)

First of all, let me give you some background. Originally there were basically two types of streaming – stateful streaming and progressive download.

A good example of stateful streaming is RTSP. The client initiated a connection to the server and sent commands like PAUSE or PLAY, the server sent back video stream packets, and the client waited for its playback buffer to fill with data and then started playback. RTSP worked over both UDP and TCP (port 554 was used).

Progressive download is where the client sends a traditional HTTP GET request, the server responds with the video data (often using HTTP chunked encoding), and the client starts playback as soon as its playback buffer has enough data.

Both approaches had serious issues – RTSP couldn’t work for clients behind proxies or firewalls without extra effort (that’s why Apple had to spend time reinventing the wheel by tunnelling RTSP over HTTP), and progressive download didn’t work well when the bandwidth wasn’t good enough – have you ever had the pleasure of sitting and staring at a “Buffering” message?

So if you want to give the highest video quality to users with high bandwidth but still show users with low bandwidth at least something, you create several versions of the same video and give users a way to choose which one to watch.

But what if a user doesn’t know what bandwidth he’s got? What if the player itself could automatically select what video to download – high-res or low-res? What if the player could change bitrate during the playback if network conditions or CPU load changed? What if the player could instantly start playback from any point of the movie? And what if pure HTTP was used so that there would be no issues with proxies? What if each chunk of video could be perfectly cached by HTTP agent, such as proxy?

That’s precisely how Microsoft Silverlight Smooth Streaming works.

First of all, Microsoft decided to switch from their ASF format to MP4. There were many reasons for that, but the main point is that the MP4 container specification allows content to be internally organised as a series of fragments, so-called “boxes”. Each box contains data and metadata, so if the metadata is written before the data, the player can have the required information about the video before it plays it.

So what does Expression Encoder do? It allows you to easily create multiple versions of the same video at different bitrates in this fragmented MP4 format. You get up to 10 versions of the same video file at different resolutions – from 320×200 up to 720p or 1080p. Each file is internally split into 2-second chunks, and each chunk has its own metadata, so the required chunk can be identified programmatically. Expression Encoder also creates two complementary files (both following the SMIL XML standard): *.ISM, the server manifest file, which basically describes to the server which file versions have which bitrates; and *.ISMC, which tells the client which bitrates are available and how many fragments the files have.

Can you see the idea? The IIS Smooth Streaming extension simply maps a URL to a chunk in a file. You do an HTTP GET request to a URL like this:

http://test.ru/mov.ism/QualityLevels(400000)/Fragments(video=61024)

And the IIS Smooth Streaming extension checks the “mov.ism” manifest file to find the filename of the version with the requested quality level (400000), then opens and parses that file to get the chunk with the requested time offset (61024). That chunk is then returned to you in a normal HTTP response.

So you can query for any chunk, at any time offset, in any of your video files.

Let me repeat that – you encoded your original video file into 10 fragmented video files at different bitrates, and you have a way to query for any chunk in any of these files.

So to play 10 seconds of video you have to do 5 consecutive HTTP requests. As we have versions of the same video at different bitrates, we can get the first chunk in the lowest quality to see how it renders and how long it takes to download, and then, if the CPU load is low and the network is fast, we can request the next 4 chunks at a higher bitrate.

And that’s exactly what Silverlight Media Player component does – it requests chunk by chunk from the server and changes “QualityLevels” parameter in URL if conditions change. For example, if Silverlight Media Player sees that CPU load is too high and it’s dropping frames, or network becomes too slow, it changes “QualityLevels” parameter to a lower value so IIS Smooth Streaming extension serves next chunks from the smaller file with lower video quality.

Actually, when the user starts playback, the first thing the Silverlight Media Player does is request the ISMC file to find out how many different bitrate versions the server has (and how to identify the fragments). Only then does it compose the URL to get the first chunk of video. Simple and beautiful technology.

So what do we have? Video plays smoothly – in lower quality on old, slow internet connections and in full HD on fast connections with good CPUs. As HTTP is used as the transport, there are no issues with proxies or firewalls; and as each chunk is identified by a unique URL, every single chunk can be perfectly cached by proxies or other HTTP agents.

And as this technology is quite simple, there’s no doubt that similar modules will appear for other web servers, or even web applications achieving similar functionality!

Yes, as it’s encoded in multiple bitrate versions, it takes up to 6 times more space for one movie/clip, but if that’s what it takes to provide users with smooth playback in any conditions – I’m for it!

Thanks for another great technology, IIS Team!


Embedded fonts

Posted in browsers, css by sharovatov on 27 April 2009

Long, long ago (back in IE4 times, yes, IE4) Microsoft proposed a standard called EOT (Embedded OpenType) which allowed you to embed any font on your website – all you had to do was prepare the EOT fonts in the free WEFT tool (see the nice how-to) and then reference them in your CSS:

@font-face {
    font-family: myFont;
    src: url(myfont.eot);
}

h1 { font-family: myFont; }

It’s interesting to know that support for @font-face property appeared in CSS2.0 without specifying of font format, then was suddenly dropped in CSS2.1 and now is back in CSS3.

And now, 10 years after this feature was introduced in IE4, all the other browsers are slowly starting to implement embedded font support. As always, browser vendors talk about compatibility more than they actually deliver it – while the technology is 10 years old and quite mature, none of the popular browsers supports or plans to support EOT – only IE.

And this silent boycott of EOT looks extremely weird, because EOT has a unique feature – a font file in this format can be much smaller than a TTF/OTF file due to subsetting. And EOT is not proprietary any more – Microsoft has submitted it to the W3C. The only reason browser vendors say stops them from implementing EOT is DRM, but:

  1. as Mark Wubben says, using OTF/TTF can violate a font’s EULA, while EOT was designed to follow the rules;
  2. there are free fonts that can be embedded.

And it’s really funny to read rants like this – if there’s a law, you can’t just violate it because you think it’s too hard to follow it.

And while browser vendors pretend they don’t see the industry standard implementation of the technology, we’ll have to use something like this:

@font-face {
   font-family: myFont;
   src: url(font.eot);
}

@font-face {
   font-family: myFont;
   src: url(font.ttf) format("truetype"),
      url(font.eot) format("embedded-opentype");
}

I.e. you declare @font-face twice – once for IE and once for the other browsers. More crap for us developers. Thanks, Opera, Mozilla and Safari.

Update: Thanks to Philip Taylor, author of a great web fonts application, who pointed out in the comments that I was wrong to say that TTF/OTF don’t support subsetting – they do! But my point is still the same – why invent another standard when there’s a working one?


Raphaël – excellent JS vector graphics library

Posted in browsers, javascript by sharovatov on 26 April 2009

When you need to create charts or do other graphically rich stuff in your web application, you usually go for Flash or Silverlight, which is fine, but quite a kerfuffle! :) I mean, you need to learn one more technology when you could achieve nearly the same results with just JavaScript and VML/SVG.

And here comes Raphaël – an awesome JavaScript library by Dmitry Baranovsky that provides a cross-browser way to work with vector graphics. Check out its demos – really impressive, and it works in all major browsers – IE6+, Firefox 3.0+, Safari 3.0+, Opera 9.5+ and even Safari on iPhone! It uses VML for IE and SVG for the other browsers. While John Resig is still working on his processingjs.com (which doesn’t work in IE at the moment), we already have a well-supported and easy-to-use library.
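Working with it is as simple as this – a tiny sketch based on the library’s documented API; all you need is to include raphael.js on the page:

window.onload = function () {
  // Raphael(x, y, width, height) creates a drawing surface – SVG in most browsers, VML in IE
  var paper = Raphael(10, 50, 320, 200);
  var circle = paper.circle(160, 100, 60);   // centre x, centre y, radius
  circle.attr({ fill: '#f90', stroke: '#333', 'stroke-width': 3 });
};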

I did some tests of Raphaël, and its performance was sufficient to use it in a production environment.

Thanks, Dmitry, really nice library!

P.S. and Dmitry’s blog on javascript is really worthy to read!

P.P.S. Heh, VML is another thing that was invented by Microsoft (and Macromedia) and then proposed to the W3C as a standard; but the W3C always has its own weird way – they decided to create the SVG spec instead. I mean, really, spending time to spec what’s already been spec’ed, so that the new spec doesn’t match the old one – isn’t that weird?



IE8 Accelerators – find broken links on a page

Posted in IE8 by sharovatov on 25 April 2009

I’ve created another accelerator – “Find broken links on a page”. It uses the W3C link checker utility to quickly check whether any of the links on a page are broken – just right-click on a page to get the context menu, go to Accelerators and press “Find broken links” :) It may be really useful.

I submitted four of my IE8 accelerators to the IE8 Add-ons gallery today and am now waiting for approval :) I hope they get approved – and then I’ll create more!

P.S. To submit your new Accelerator or Web Slice, you need to register at ieaddons.com, go to the Upload page, provide the name, description, screenshot and XML of your Accelerator, and then wait for approval :)



Various IE8 Accelerators – thepiratebay, php.net, pagerank, alexa

Posted in IE8 by sharovatov on 24 April 2009

I recently updated to Internet Explorer 8, and it just rocks! One of the shiny new features added to IE8 is Accelerators. I’ve just written a few to make my life a bit easier. Hope you find them handy.

Search category:

  1. Search on ThePirateBay – just select any text on the page, wait for the Accelerator icon, click “Search on ThePirateBay”, and it will open a new tab with the search results

SEO category:

  1. Check the Google PageRank of a page by its link – just right-click on any link, go to Accelerators, hover over “Check pagerank” and you’ll get its PageRank fetched directly from Google.
  2. Check the Alexa rating of a page by its link – same instructions – right-click on a link, go to Accelerators, hover over “Check Alexa rating” and you’ll see the page’s Alexa rating in a preview window.
  3. Check Alexa + PageRank – a slightly more useful Accelerator which combines both the Alexa and PageRank checks in one, but is hosted on my server – again, it shows the Alexa rating and Google PageRank in the preview window.

Web-development:

  1. PHP manual for the selected function – just select a function name on the page, wait for the Accelerator icon to appear, click “Search definition on php.net” and you’ll get the php.net manual page for that function
  2. Validate a page – right-click on any link to get the context menu, go to Accelerators, select “Validate a page” and you’ll get the w3.org validator checking that link!

Remember, Accelerators work only in IE8, and clicking on these links in other browsers will just show the XML source of accelerators :)

It really took me only half an hour to grasp the manual and write these six accelerators – thanks, Microsoft, for such an easy and productive platform.

Some useful links:

  1. http://en.wikipedia.org/wiki/Internet_Explorer_8#Accelerators
  2. IE Accelerators Gallery
  3. How to write an IE8 Accelerator for your website
  4. Creating custom accelerators



McAfee scanalert img src

Posted in http by sharovatov on 23 April 2009

Recently I’ve blogged about URL Common Internet Scheme Syntax, and today when I was adding ADDTHIS bookmark button to our corporate website, I noticed that McAfee Secure uses this scheme when specifying source URL for the McAfee Secure image:

<img width="94" height="54" border="0" src="//images.scanalert.com/meter/{companyname}/13.gif" alt="McAfee Secure sites help keep you safe from identity theft, credit card fraud, spyware, spam, viruses and online scams" oncontextmenu="alert(‘Copying Prohibited by Law – McAfee Secure is a Trademark of McAfee, Inc.’); return false;">

Here we can see the best use case for the URL notation I’ve been talking about – it doesn’t matter whether the client visits a page with this code over HTTPS or HTTP – the image will always be there. No overhead when clients are on plain HTTP, and no mixed-content security warnings when the page is accessed over HTTPS.

Well done, McAfee!



rel=”canonical”

Posted in no category by sharovatov on 22 April 2009

I recently found out about a very useful bit of SEO functionality supported by Google (here), Microsoft Live Search (here), Yahoo (here) and Ask.com (here) – canonical links. Basically, if you have a page accessible by multiple URLs, from the search engine’s perspective you’re bloating its database by serving duplicate content. To avoid this, you set <link rel="canonical" href="..."> with href pointing to the original URL of the page. Search engines will check the URL in the href attribute and won’t put a duplicate-content penalty on your pages.
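For example (the URLs here are made up), a product page that is also reachable with tracking or session parameters can point all of its variants to one canonical address:

<!-- served on http://example.com/product?id=42&sessionid=abc123 -->
<link rel="canonical" href="http://example.com/product?id=42">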

Useful thing to remember and use!

WordPress (here) and RoR (here) already have plugins for canonical URL support. I hope to see support for this useful rel value in other frameworks, forum and blog engines.

And I’ve just found an interesting post on this topic by Anne van Kesteren, with an interesting discussion going on in the comments.



startpanic.com and :visited links privacy issue

Posted in browsers, privacy, security by sharovatov on 21 April 2009

Back in April 2008 I blogged about Selectors API support in IE8 Beta 1 and mentioned the security concern about :visited links – a potential privacy leak.

The problem

This concern was raised long ago in the CSS 2.1 spec (and also mentioned in the later specs – CSS3 Selectors and the Selectors API spec):

Note. It is possible for style sheet authors to abuse the :link and :visited pseudo-classes to determine which sites a user has visited without the user’s consent.

UAs may therefore treat all links as unvisited links, or implement other measures to preserve the user’s privacy while rendering visited and unvisited links differently.

The original Bugzilla issue was reported back in October 2000, the Stanford same-origin whitepapers described this issue in 2002, then lots of articles followed, and then Ajaxian ran an article in 2007 which made the issue really popular.

And now we have http://startpanic.com with a nice implementation of this approach – it has a plain-text database of several thousand URLs that are tested for having been visited.

You can check the code – it’s pretty straightforward: links from the database are appended to an iframe whose stylesheet makes :visited links display differently from unvisited ones; then the computed style of each link is checked, and every link that has the “visited” style is appended to the big list of visited links.
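A simplified sketch of the technique (not startpanic.com’s actual code – the site uses an iframe and a big URL list, but the idea is the same, and the URL below is arbitrary):

// visited links get display: none, unvisited ones stay visible
var style = document.createElement('style');
style.type = 'text/css';
if (style.styleSheet) {                        // IE
  style.styleSheet.cssText = 'a:visited { display: none; }';
} else {
  style.appendChild(document.createTextNode('a:visited { display: none; }'));
}
document.getElementsByTagName('head')[0].appendChild(style);

function wasVisited(url) {
  var a = document.createElement('a');
  a.href = url;
  a.appendChild(document.createTextNode(url));
  document.body.appendChild(a);
  // read the computed style back: if the link is hidden, the browser considers it visited
  var display = a.currentStyle ? a.currentStyle.display
                               : window.getComputedStyle(a, null).display;
  document.body.removeChild(a);
  return display === 'none';
}

alert(wasVisited('http://www.google.com/') ? 'visited' : 'not visited');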

Possible solutions

Basically, there’re some ways to resolve this issue:

  1. try to protect :visited links computed style access
  2. limit support of :visited
  3. don’t fix it, find a way around

Protect :visited links programmatically

That’s clearly useless. People were suggesting many solutions (which you can get round), like making getComputedStyle return default value for :visited links as if they are not visited – but you can make the case more complex, e.g.

a:link span {display: none;}
a:visited span { display: block }

and then use getComputedStyle to check the span; all the proposed solutions were weak in some way. But even if you manage to keep scripts unaware of the state of your links, there will always be a server-side attack vector – for the links that you want to check you can just specify a unique background-image pointing to some server-side tracking script, e.g.:

#alQaedaLnk:visited { background: url(http://www.cia.gov/track.pl); }

Like here, for example. So it’s clear that there’s no way (or it’s too troublesome) to fix this “issue” programmatically.

Limiting support for :visited

As I understood from a discussion with startpanic.com’s author, he wants to limit :visited support so that the :visited pseudo-class is applied only to links pointing to pages on the same domain. But this would hurt the user experience so much! The first example that comes to mind is Google and other search engines – they all colour visited links differently so you can clearly see which pages you’ve already been to and which you haven’t. If a same-domain policy were applied, all the links in Google search results would be plain blue. That sounds awful to me.

The guys from the Stanford security group suggest applying :visited only to links that were visited from the current domain. This approach was used in a Firefox add-on called SafeHistory (which doesn’t work any more). So if you do a search on Google and visit some pages, :visited will be applied to those pages only in Google search results; if you then do a search on MSN Live Search, all the links there will be plain blue and :visited won’t be applied to them. To me this solution looks weird as well. Firefox developers said it would be a problem to support, and I don’t think other browser vendors will fix this privacy “issue” that way either. Keep reading and I’ll explain why.

So from a technical perspective the only easy solution would be to completely remove support for the :visited pseudo-class, which nobody will do because the user experience would suffer and people would complain.

Best solution – find a way around

You may think – why not make :visited support configurable in the browser UI? But that’s what all browsers already have! You can specify that you don’t want to keep history at all; you won’t see visited links anywhere, and you’ll feel that you’re “safe” :)

Another solution – Private Browsing mode

Another nice option is to use the Private Browsing mode that’s supported by all modern browsers – IE8, Safari, Google Chrome (and now FF3.1 has joined them). When you visit a site that you don’t want to appear in your history, use Private Browsing mode and you’re safe.

Note: Google Chrome currently has a bug – it applies the :visited pseudo-class to links visited in Incognito mode. However, the bug has been fixed and the fix will be included in one of the upcoming updates.

The “Private Browsing” browser feature is the only true solution to this issue.

Here’s a testcase. I visited both http://ya.ru and http://www.google.com links in IE8 InPrivate mode, then went to the testcase page and it didn’t tell anything as if I hadn’t visited these URLs.

When I followed the same process in Google Chrome’s Incognito mode, it showed that I had visited both ya.ru and google.com. As mentioned above, this bug has been fixed and the fix will ship in newer versions of Google Chrome.

And this issue is also fixed in FF3.1b3.

In the comments Avdeev said that Safari in its Private Browsing mode (also jokingly called “porn mode”) didn’t show whether a link had been visited or not. Great stuff!

Update: It seems that Opera 10 will have Private Browsing mode support as well – they are already choosing a name for it, and the most voted-for suggestion is “Phantom mode” :)

Note: while I understand the whole concern about privacy, you shouldn’t forget that all search engines, ad providers and many, many others gather statistics about your visits. When you’re in London (and other major cities), you’re constantly being watched on CCTV – does that bother you? Does this new world leave any space for privacy?
