DNS Prefetching (or Pre-Resolving)

Wednesday, September 17, 2008

A major goal of Google Chrome was to improve user enjoyment and value in web surfing. Critical to that is increasing the responsiveness of the browser to user input, or reducing user-perceived latency. Measurements in the browser have shown that a significant amount of time is traditionally spent waiting for DNS to resolve domain names. To speed up browsing, Google Chrome resolves domain names before the user navigates, typically while the user is viewing a web page. This is done using your computer's normal DNS resolution mechanism; no connection to Google is used. As a result, user navigation time in Google Chrome when first visiting a domain is on average about 250ms faster than traditional browsing, and the occasional but painful 1-second-plus delays are almost never experienced.

How it works, and how much it helps.

First off, DNS Resolution is the translation of a domain name, such as www.google.com, into an IP address, such as 74.125.19.147. A user can't go anywhere on the internet until after the target domain is resolved via DNS.
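
To make the cost of a lookup concrete, here is a minimal sketch that asks the operating system's resolver for a name and times the call. It uses only Python's standard socket module; the function name timed_resolve is hypothetical, and this is an illustration of what "resolving a name" means, not how Google Chrome itself is implemented.

```python
import socket
import time


def timed_resolve(hostname):
    """Resolve hostname via the OS resolver; return (sorted addresses, elapsed ms)."""
    start = time.perf_counter()
    infos = socket.getaddrinfo(hostname, None)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return addresses, elapsed_ms
```

Calling timed_resolve("www.google.com") twice in a row will usually show a slow first call and a fast second one, because the second answer comes from a cache rather than from the network.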

The histograms at the end of this post show actual resolution times encountered when computers needed to contact their network for DNS resolutions. The data was gathered during our pre-release testing by Google employees who opted in to contributing their results. As can be seen in that data, the average latency was generally around 250ms, and many resolutions took over 1 second, some even several seconds.

DNS prefetching just resolves domain names before a user tries to navigate, so that there will be no effective user delay due to DNS resolution. The most obvious example where prefetching can help is when a user is looking at a page with many links to various unexplored domains, such as a search results page. Google Chrome automatically scans the content of each rendered page looking for links, extracting the domain name from each link, and resolving each domain to an IP address. All this work is done in parallel with the user's reading of the page, hardly using any CPU power. When a user clicks on any of these pre-resolved names to visit a new domain, they will save an average of over 250ms in their navigation.  
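
The scan-extract-resolve loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names LinkHostCollector, os_resolve and prefetch_hosts are hypothetical, only plain anchor tags are considered, and the results are returned to the caller purely so they can be inspected. The real point of a prefetch is the side effect of warming the operating system's DNS cache.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urlsplit
import socket


class LinkHostCollector(HTMLParser):
    """Collects the distinct hostnames found in <a href="..."> attributes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                try:
                    host = urlsplit(value).hostname
                except ValueError:
                    host = None  # Malformed URL: nothing to resolve.
                if host:
                    self.hosts.add(host)


def os_resolve(host):
    """Ask the OS resolver for the host; the OS does whatever caching it wants."""
    return socket.gethostbyname(host)


def prefetch_hosts(html, resolve=os_resolve, max_workers=8):
    """Resolve every linked hostname in parallel; return {host: address or None}."""
    collector = LinkHostCollector()
    collector.feed(html)
    hosts = sorted(collector.hosts)
    if not hosts:
        return {}

    def safe_resolve(host):
        try:
            return resolve(host)
        except OSError:
            return None  # A failed lookup should never disturb the page.

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(hosts, pool.map(safe_resolve, hosts)))
```

Relative links such as "/about" have no hostname and are skipped, and duplicate links to the same domain trigger only one lookup. The resolve parameter exists so the sketch can be exercised with a fake resolver instead of the network.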

If you've been running Google Chrome for a while, be sure to try typing "about:dns" into the address bar to see what savings you've accrued! Humorously, this prefetching feature often goes unnoticed, as users simply avoid the pain of waiting, and tend to think the network is just fast and smooth. To look at it another way, DNS prefetching removes the variance from surfing latency that is induced by DNS resolutions. (Note: If about:dns doesn't show any savings, then you are probably using a proxy, which is resolving DNS on behalf of your browser.)

There are several other benefits that Google Chrome derives from DNS prefetching. During startup, it pre-resolves domain names, such as the home pages, very early in the startup process. This tends to save about 200-500 ms during application startups. Google Chrome also pre-resolves the host names in URLs suggested by the omnibox while the user is typing, but before they press enter. This feature works independently of the broader omnibox logic, and doesn't utilize any connection to Google. As a result, Google Chrome will generally navigate to a typed URL faster, or reach a user's search provider faster. Depending on the popularity of the target domain, this can save 100-250ms on average, and much more in the worst case.

If you are running Google Chrome, try typing "about:histograms/DNS.PrefetchFoundName" into the address bar to see details of the resolution times currently being encountered on your machine.

The bottom line to all this DNS prefetching is that Google Chrome works overtime, anticipating a user's needs, and making sure they have a very smooth surfing experience. Google Chrome doesn't just render and run JavaScript at a remarkable speed; it gets users to their destinations quickly, and generally sidesteps the pitfalls surrounding DNS resolution time.

Of course, the best way to see this DNS prefetching feature work is to just surf.

Sample of DNS Resolution Times requiring Network Activity (i.e., over 15ms resolution)

The following is a recent histogram of aggregated DNS resolution times observed during tests of Google Chrome by Googlers, prior to the product's public release. The samples listed are only those that required network access (i.e., took more than 15 ms). The left column lists the lower bound of each bucket. For example, the first bucket lists samples between 14 and 18ms inclusive. The next three columns show the number of samples in that range, the fraction of samples in the range, and the cumulative fraction of samples at or below that range. For example, in the first bucket, there were 31,761 samples, or about 0.51% of all the 6,228,600 samples shown. Looking at the cumulative percentage column (far right), we can see that the median resolution took between 87ms and 118ms (43.63% took less than 87ms, and 52.71% took less than 118ms). Reading from the top of the chart, the average DNS resolution time was 271ms, and the standard deviation was 1.130 seconds. The "long tail" may have included users that lost network connectivity, and eventually reconnected, producing extraordinarily long resolution times.
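
The summary figures can be recomputed directly from the raw counts quoted in this post. The short sketch below does that arithmetic; the variable names and the helper median_bracket are hypothetical, and only the two cumulative data points mentioned in the text are used, since the full histogram is not reproduced here.

```python
# Figures quoted in the post.
count = 6_228_600        # total samples
sum_ms = 1_689_207_135   # sum of all resolution times, in ms
first_bucket = 31_761    # samples in the first bucket

mean_ms = sum_ms / count
first_bucket_share = first_bucket / count

# Cumulative fractions quoted in the post: share of samples faster than each bound (ms).
cumulative = [(87, 0.4363), (118, 0.5271)]


def median_bracket(cumulative):
    """Return the (lower, upper) bounds between which the cumulative share crosses 50%."""
    previous = None
    for bound_ms, fraction in cumulative:
        if fraction >= 0.5:
            return previous, bound_ms
        previous = bound_ms
    return previous, None


print(f"mean = {mean_ms:.1f} ms")                      # mean = 271.2 ms
print(f"first bucket = {first_bucket_share:.2%}")      # first bucket = 0.51%
print("median lies between", median_bracket(cumulative))  # (87, 118)
```

The mean sits well above the median because a small number of very slow resolutions in the long tail pull the average up.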


Count: 6,228,600; Sum of times: 1,689,207,135; Mean: 271 ± 1130.67


44 comments:

Enrique Fica said...

google chrome without the toolbar google is not good.

we want the toolbar google

Kevin Quillen said...

The google toolbar is unnecessary, just type google in the address bar then tab to enter a search query. This eliminates having to have two input bars at the top of the browser.

You can enter any major domain name like amazon, or cnn, then hit tab to do a search query against that site before going anywhere.

harald said...

There was a brief flurry of complaints about this practice on the DNS operators list; some were concerned about the increased load.

The discussion died, though, so I'm assuming these fears were unfounded...

Apostolos said...

Does this work for dynamically generated links too ( e.g. links on Google Reader, Gmail, etc )?

Andy Davies said...

I can live without the toolbar, I just want integration with Google Bookmarks

Andrew Dalke said...

Have you just introduced a new sort of webbug? Each page embeds a unique name within some domain. The DNS server for that domain can track views from Chromium simply by seeing if there's a request for a given name. This occurs even if images are turned off.

Robbie said...

Can we make Chrome compatible with Firefox addons? I think so- Go for it!

Andrew Witte said...

@Andrew Dalke:

Not a very reliable webbug, since DNS results will tend to get cached by users' DNS providers (i.e. their ISPs) and therefore the website operator won't see all the prefetch requests.

Andrew Dalke said...

@Andrew Witte

Addresses are cheap. Create a new one for every generated page. After all, it's possible to tunnel IP over DNS lookups.

LarsG said...

@Witte:

Many DNS servers support wildcards, say *.chromewebbug.example.com

Then all example.com has to do is generate a link with a unique fqdn for each page request. $uniqueid.chromewebbug.example.com. Since it is unique, it won't already be in the isp's dns cache.

Collin Jackson said...

The DNS Prefetching design document describes how to use "DNS prefetch control" to enable or disable DNS prefetching. This feature can be used to prevent web bugs.

Onekopaka said...

@robbie

WebKit can't view XUL, so making Chrome compatible with Firefox extensions is impossible. Also, addons use ns* javascript functions which are only in the Spidermonkey / Tracemonkey JS engine.

Martin said...

i love this browser, but plz fix flash first before doing anything else.

Simetrical said...

@Andrew Witte

What scenario are you envisioning in which a user is capable of inserting a link that's different on every page view, but can't already track views anyway? If you can insert different content per page view, that implies you can add scripts of some kind to the page, so why don't you use that scripting capability to log the visitors directly?

Moreover, even if you can skip DNS caching, you're still not going to get very informative results. As far as I know, all ISPs provide recursive DNS servers for their users. When a user makes a DNS request that misses the ISP's cache, the ISP's server will query the next layer of servers directly, so you're not going to know which user of that (possibly gigantic) ISP visited the page.

So the most you should be able to tell from this is the number of requests from a particular ISP in a particular region, nothing that could usually be linked to individual users -- and that's even assuming you can come up with a case where you can provide per-request domain names but not directly log visitor info. I don't think this is any privacy issue at all, and any hypothetical concerns are certainly outweighed by the performance gains for the general population.

John said...

The DNS prefetching is a useful feature, but it prevents access to sites at xxx.dyndns.org as these sites are dynamic and change IP address. The DNS prefetch locks it into a single IP address. It would be useful if the dynamic web addresses were excluded from the prefetch.

Simetrical said...

@John

What's an example of a page that doesn't work? You should report that page as a bug for Chrome developers to look at, if you haven't already. I'm assuming that there should be no problems, since Chrome presumably respects TTL and should discard results that are stale.

pkasting said...

@john:

There shouldn't be any problems with dyndns pages. Chromium isn't caching results internally, it's simply asking the OS to look up the IP or a hostname, and letting the OS do whatever caching it wants to (or doesn't). The effect on local DNS caches is identical to if the user clicks the link (in any browser), so the only problem is if the OS itself isn't respecting TTL properly -- and that'd be a problem for any browser, not just Chromium.

Kelson said...

Actually, thinking about it, there is a potential privacy issue with spam if you use Chrome to read email via a web app.

One of the reasons people don't use unsubscribe links is that most of them are bogus. But another reason people don't use unsubscribe links is that many of them are just used to confirm that the email address is actually being read by someone. The spammers don't care what your IP address is, they just want to know their junk has a better chance of getting seen.

Currently, most of these links use one of the spammer's throw-away domains with a unique path. I've occasionally seen it with a unique subdomain.

Pre-resolve the domain, and if they're logging requests they'll know your email address is attended by a real, live person -- even if you don't load images, don't run scripts, and don't click on any links.

John said...

@pkasting & @Simetrical

There is a problem and it has been reported as a bug. I have a server that used dyndns.org to maintain an Internet connection. I use a Wireless ISP for my Internet as the only other option is dialup. They do not offer a static IP.

Chrome will not retrieve any web page from my server after a reconnect, when the IP address changes for my Internet connection, if I stay in the same browser session. As soon as I close Chrome entirely and then start a new Chrome browser it works immediately. I have confirmed that the workstation knows the correct IP for the web server. Unchecking the DNS prefetching does not fix the problem.

KooKiz said...

Kelson > Correct me if I am wrong, but the DNS resolution will only take the domain, not the complete URL (for instance, www.spam.com for www.spam.com/unregister?email@server.com). So they know that a computer with the IP xxx.xxx.xxx.xxx is resolving this domain, but there's no way for them to know which e-mail address it comes from.

Simetrical said...

@KooKiz

Here's the full exploit:

1) Spammer registers evil.com and sets its authoritative DNS server to be his own server.

2) Spammer sends out 10,000,000 e-mails. Each e-mail goes to a different address, and each contains a different link. The first one might be 1.evil.com, the second might be 2.evil.com, etc. The spammer records which domain name was sent to each e-mail address.

3) Some user opens a spam e-mail, intending to report it as spam, or not even realizing it's spam (e.g., hitting "]" in Gmail).

4) Chrome sees the link. It tries to resolve the domain name. Since the domain name is unique, it's not in any DNS caches, and a request ends up (probably via various proxies such as recursive DNS servers) going to the spammer's server.

5) The server sends a response to the request, or not, it doesn't really matter. But it notes which domain name this was and checks it against its database of e-mail addresses. Whichever address the domain name was sent to must be read by someone, so the spammer knows to send e-mail there in the future. As an added bonus, the spammer might know some info about the user's ISP, based on the computer that made the DNS request.

The above description assumes that no one but the reader will resolve the domain name. This may be false: it's possible that some webmail providers will try to follow links in the body of e-mails to help determine whether the e-mail is spam -- e.g., if the domain name resolves to a known spammer's IP, or the content of the link is known to have been widely spread by spam before.

However, even if the mail providers do this, the user's browser will almost certainly end up resolving the domain name again. (It could be set to have a TTL of 0, just to be sure.) It should be pretty easy to distinguish webmail providers' DNS resolutions by access patterns (almost immediate after receipt of e-mail) and IP addresses, and discard those, so this would provide little protection.

On the other hand, it's not like this is a massive exploit. In practice, I'm guessing spammers aren't going to bother removing inactive e-mail addresses often if at all from their lists, because it costs them virtually nothing to send the extra spam mail. So how much practical effect this will have I'm not sure about.

The issue could be avoided if webmail providers (and, potentially, others in similar situations) marked untrusted links as such, maybe using rel="untrusted" or a similar microstandard that could be developed. Or, you could just turn off DNS prefetching.

pkasting said...

@kelson, simetrical:

This is an example of why we provide per-link and per-page methods for web authors to turn off DNS prefetching. Mail sites can disable DNS prefetching on pages containing possibly-spam content, thus rendering this spamming technique futile.

Kelson said...

@simetrical: "In practice, I'm guessing spammers aren't going to bother removing inactive e-mail addresses often if at all from their lists, because it costs them virtually nothing to send the extra spam mail"

Given that they don't seem to remove addresses that actually bounce, you're probably right. But I understand that some spammers make a tidy business of selling lists of "confirmed" addresses.

@pkasting: I should have looked at that design document! It's good to know that sites can disable prefetching on their links if necessary.

One reservation: I didn't think it was legal to put a META tag inside BODY.

Steve Souders said...

This is great to see Chrome so focused on performance. I was hoping to see other info about DNS caching in Chrome. Does Chrome cache DNS resolutions longer than the TTL? Similar to IE’s 30 minute cache, or Firefox’s dnsCacheExpiration setting? How many resolutions does it cache (a la FF’s dnsCacheEntries)? Does re-use of the domain within a certain time period affect the cache, similar to what FF’s network.http.keep-alive.timeout value does?

Although the average in this sample is 271ms, that seems high. The median is 87ms, which is more typical in my experience.

jar said...

@steve souders:

The underlying Chromium implementation relies on the OS to actually cache resolutions. It appears that the evictions from OS cache seem to happen when there are between 50 and 200 entries.

I've seen discussion of the IE "feature" of ignoring TTLs, and guaranteeing at least 30 minutes prior to evictions, but there was no effort made to replicate this "feature."

Chromium assumes that OS cache evictions can take place after only 5 minutes in the cache (Chromium does not have access to the real TTLs, and so assumes the common short time of 5 minutes is effective). By re-warming periodically (when needed, and when eviction seems probable) Chromium tends to keep the cache up-to-date for current browser navigation needs. As a result, re-use (or simply re-appearance where prefetching is applicable) does impact the cache contents.

Vladislav.Mysla said...

this is not hard to resolve all DNS aliases. but if google will store all data in the cache, then this service can be helpfull... for someone :)

Carlos said...

google chrome without the toolbar google is not good.

we want the toolbar google

I agree 100% with Enrique Fica

Please we need the google toolbar for google chrome to save the favorites

HcqnhpN4o4RIKfhaVaz5cY55IWLMtk8- said...

Thanks, Simetrical, for the detailed description. It's always good to spell it all out clearly.

I think a good solution for this and other problems would be if the webmail provider could tell the browser which part (whole HTML fragment) is untrusted, using some markup, instead of relying on checks on server-side. Gerv from Mozilla Foundation made such a proposal http://www.gerv.net/security/content-restrictions/ . Google Chrome could take this information to not make the DNS lookups in that section, problem solved (assuming it only affects webmail).

I like the idea of DNS prefetching, if not done excessively (from what I read, Chrome does it a bit excessively, even pressing other entries out of the cache), because DNS lookups are indeed a problem for speed. Unfortunately, the proposal above is the only solution I see at the moment. Webmail providers can't strip the domain names, as that would break (valid) links.

The problem could appear at a different place in a different way, too: Google search result pages. The results are not custom per user, but at least allow the provider some rough idea of how often his page is seen on search result pages vs. how often it's clicked. DNS caches would void some of that, but they can lower the TTL again, so the server would only cache for their minimum of 5 minutes or so. Actually, it's possible that the abuse get so out of hand that DNS server admins increase the minimum TTL, breaking valid uses of a low TTL.

BTW: I do see spammers removing email addresses which bounce. It takes a long time (many months), but I no longer get spammed on addresses which I removed 1-2 years ago.

Kamahl said...

@somerandomyahooguy

That spammer probably went out of business, and stopped sending to the entire list...

Lendal said...

TOOLBAR or at least my google bookmarks PLEASE

satya said...

plz add google toolbar in it as this is the only lacuna which prevent me to use chrome as default browser.

lrose said...

please fix the flash, thank you for your time

Eric said...

i agree with the request for Google Bookmarks (and/or Google Toolbar) integration. this is a must have feature for me to work on multiple machines and bring my bookmarks with me. otherwise, chrome is quite compelling.

Undocumented said...

well for me it takes for ever to resolve hostnames. Sometime more than half a minute

SoulGuard said...

http://startearnnow.99k.org

Google chrome is Faster, Slimmer and a performance machine when compared to other browsers. Very nice article explaining DNS prefetching.

I think Open DNS does the same.

Bunty said...

I think DNS prefetching although is a novel idea and can potentially improve page load time, I see it as a performance bottleneck. Routers, gateways and DNS servers (in particular) do what Chrome tries to do. My Chrome was crawling and after I disabled DNS prefetching it started to work smoothly and loaded pages in a jiffy. This feature should be selective, and configurable allowing users to manage whats being resolved, fetched and cached

Koalawrangler said...

Google toolbar is central to my browsing experience. It allows me to search against my delicious bookmarks, my flickr account, etc. I can log into it from any computer and all my settings are there. It is the only thing stopping me from switching to Chrome. Until then, I'm sticking with IE and the Google toolbar...

Prashanth Gedde N said...

I liked this feature.

But I worry about the DNS cache eviction that can happen if there are too many resolutions. We may lose a few of the necessary addresses from the cache.

How is this problem tackled?

tejas said...

Not sure why all the toolbar fanatics are posting on this thread..

shaveen said...

chrome is good but lack with GUI, Google should not think this is the way our clients need to think like We provide a very good backend software but front end is not good as other browsers. Customer has to decide this So the google has to give customer the option to choose though it effect the speed of the browser, memory and other aspects which effect the performance.

s. knuijver said...

bandwidth is not really an issue over here in the Netherlands so prefetching is something I really like

s. knuijver said...

firefox had fasterfox a real prefetcher is there a similar thingy for chrome? you can just limit the amount of memory dedicated to precaching each site to prevent some tings, i don't know how to limit precaching adclicking sites, by the way since that's how google makes money aren't they doing this? i see a small list of ads in the about dns ..

SEWilco said...

It would be helpful to document the meaning of the histogram. I'm guessing that having a peak at " 8289 ------------------------------------------------------------------------O (5801 = 36.2%) {37.0%}" isn't good.