Lately I've been accused by some of spreading fear, uncertainty and doubt (FUD) by trying to let people know their search terms are being leaked to the sites they click on. I hope to address those concerns in this post.
For those of you who have no idea what I'm talking about: when you click on a link on the Internet, where you clicked from gets automatically sent to the site you clicked on (most of the time).
For example, if you're on yahoo.com and you click to a story at the New York Times, your browser will tell newyorktimes.com that you came from yahoo.com by sending it the Web address of the page you were just on. This info is called the Referrer.
At issue here is that sometimes the Referrer contains personal information. In particular, when you use most search engines, your search terms are included in the Referrer. That is, when you search on Google/Bing/etc., and you click on a link, your search terms are sent to the site you clicked on. This search leakage doesn't happen at DuckDuckGo.
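To make the leak concrete, here is a minimal Python sketch of what a destination site could do with the Referrer it receives (in the HTTP header itself the name is spelled Referer). The example URL and the assumption that the search terms sit in a URL parameter named q are illustrative only; the exact URL format varies by search engine.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms_from_referrer(referrer, param="q"):
    """Return the search terms embedded in a Referer value, or None.

    Assumption for illustration: the search engine puts the query in a
    URL parameter named `param`.
    """
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param)
    return values[0] if values else None

# Hypothetical Referer value sent along with a click on a search result.
referrer = "http://www.google.com/search?q=gout+symptoms&hl=en"
terms = search_terms_from_referrer(referrer)

# A referrer with no query string yields nothing to extract.
no_terms = search_terms_from_referrer("http://www.yahoo.com/")
```

Nothing here requires cooperation from the search engine; the site only reads what the browser already sent.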
Now, let's take the FUD arguments in turn.
One site having one of my search terms is irrelevant. That may generally be the case, but unfortunately, tens of millions of sites run ads from just a handful of ad networks. Those ad networks can aggregate your search terms and piece together a large percentage of your search history.
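A rough sketch of how that aggregation could work, under stated assumptions: the ad network's code runs on many sites, tags each visitor with a cookie ID, and reports the Referrer of the page the visitor landed on. The cookie IDs, the URLs and the q parameter name are all hypothetical, and real ad systems are surely more involved than this.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qs

def aggregate_search_history(ad_requests, param="q"):
    """Group leaked search terms by visitor cookie.

    `ad_requests` is a list of (cookie_id, referrer) pairs, one per page
    view reported to the (hypothetical) ad network.
    """
    history = defaultdict(list)
    for cookie_id, referrer in ad_requests:
        values = parse_qs(urlsplit(referrer).query).get(param)
        if values:  # only referrers that carry search terms
            history[cookie_id].append(values[0])
    return dict(history)

# Page views from different sites that all run the same ad network.
ad_requests = [
    ("cookie-123", "http://www.google.com/search?q=gout"),
    ("cookie-123", "http://www.google.com/search?q=cheap+flights+boston"),
    ("cookie-456", "http://www.google.com/search?q=tudors+season+2"),
    ("cookie-123", "http://example.com/some-article"),
]
histories = aggregate_search_history(ad_requests)
```

Each site on its own saw one search; the network that sits across all of them ends up with a per-visitor list.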
It's not Google's fault. Your browser sends that stuff. That's true, but Google et al. could easily fix it. It is a technically trivial fix. In fact, Google did it for a bit when they switched to using Ajax.
So the question then becomes: if you're a company that cares about user privacy and can easily stop third parties from piecing together your users' search histories, why wouldn't you do it?
In other words, I find this FUD argument to be a straw man argument. While you can fault the browser or the Internet, that doesn't mean someone who is able shouldn't come in and fix it.
It would hurt SEO. The only reason I've heard to not prevent search leakage is that marketers use Referrer info to do better search engine optimization (SEO).
But the information doesn't have to disappear, just the current mechanism of transferring the information in a personally identifiable way. Google et al. could provide sites with the information in an anonymous fashion. At that point, I think the only thing marketers couldn't do would be to dynamically serve you different pages based on your personal search terms.
So the question then becomes: is that trade-off worth it?
Google Webmaster Tools (GWT) doesn't provide that full information. Matt Cutts wants me to stop saying GWT can solve this marketer problem because while GWT provides a lot of information, it does not currently provide all the terms people search for to get to your site. That's true; sorry Matt.
But the key word is currently. There is no reason I can see why it couldn't provide a more comprehensive view into this data.
Google provides ways to opt out. The only thing I know that somewhat protects you from Referrers is Google's encrypted version, which doesn't protect you fully (because https->https traffic still sends Referrer headers).
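The https caveat can be sketched as a small rule. This models the common browser default, which follows the HTTP specification's recommendation not to send a Referrer when going from a secure page to a non-secure one; individual browsers and settings may differ, and the function name and URLs are made up for the example.

```python
from urllib.parse import urlsplit

def browser_sends_referrer(from_url, to_url):
    """Model the common default: the Referer is withheld only when
    navigating from an https page to a non-https page."""
    from_scheme = urlsplit(from_url).scheme
    to_scheme = urlsplit(to_url).scheme
    return not (from_scheme == "https" and to_scheme != "https")

encrypted_search = "https://www.google.com/search?q=gout"

# https -> http: the Referrer (and the search terms in it) is withheld.
to_plain_site = browser_sends_referrer(encrypted_search, "http://example.com/gout")

# https -> https: the Referrer is still sent, search terms included.
to_secure_site = browser_sends_referrer(encrypted_search, "https://example.org/gout")
```

So the protection depends on the site you click through to, not only on the search engine.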
Most people have no idea that the encrypted version is related to this problem, or that it even exists. Furthermore, you still can't just type in https://google.com/ to get there (you have to add the www.).
But all that is beside the point, because you shouldn't have to opt out of this search leakage in the first place. Your search results won't suffer -- Google still has your history.
Therefore, it should be the default. Matt says SSL can't be the default because of latency, but that is another straw man argument IMHO. You don't need SSL to solve this problem as evidenced by their Ajax incident and DuckDuckGo.
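One way to read the Ajax point, as a sketch: browsers leave the fragment (the part of the address after #) out of the Referrer. If the search terms live in the fragment rather than in the query string, they never reach the destination site, with or without SSL. The URLs below are illustrative assumptions about what such addresses look like, not a record of what any search engine actually served.

```python
from urllib.parse import urldefrag

def referrer_sent_for(page_url):
    """Return the Referer value a browser would send for a click made on
    page_url: the address with any #fragment removed."""
    return urldefrag(page_url).url

# Hypothetical Ajax-style search address: terms after the '#'.
ajax_style = "http://www.google.com/#q=gout"
# Hypothetical classic search address: terms in the query string.
classic = "http://www.google.com/search?q=gout"

ajax_referrer = referrer_sent_for(ajax_style)      # terms dropped
classic_referrer = referrer_sent_for(classic)      # terms kept
```

This is only one possible mechanism; the general idea is to make sure the address the browser reports does not contain the search terms.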
You're just attacking Google when Bing et al. do it too. I want everyone to solve this issue and I've tried to put "et al." in this post a lot. However, the reality is Google is synonymous with search. Despite what search market share #s say (I still don't grok them), pretty much everyone I talk to about search talks about Google.
In any case -- Bing, Yahoo, etc. -- if you're listening, please solve this issue at your search engines too.
To summarize, here's my basic argument:
1) Search engines say they care about user privacy.
2) They are currently allowing third parties to aggregate user search history by not blocking the browser from sending search terms in the Referrer header.
3) There is an easy fix.
So why isn't the fix a no-brainer?
Here is a representative example of feedback emails I get on this subject. I got this user's permission to share.
I just replaced Google with DuckDuckGo as my default search engine. I'm VERY tired of having advertisers jump all over me every time I do a search for, well, anything.
For example: watching THE TUDORS on iTunes, one of the characters had gout. I wanted to know if gout was a recognized disease during the time of the Tudors. So I Googled "gout", and checked out the wikipedia entry on the subject. Turns out it was in fact a recognized disease at the time (although they had no idea what caused it). I don't have the disease. I don't personally know anyone who does. I certainly don't have any need for medications that treat gout. But now I'm constantly bombarded with ads for all kinds of drugs intended to treat it.
All I did was get curious, just once, about a disease suffered by a TV character on a show I like to watch, and now every advertiser on the planet is apparently convinced that either I, or someone I know, has gout, and they're not about to pass up even the most minuscule chance of selling me something.
Here's the official response Google gave to Wired:
"It's unfortunate that DuckDuckGo is preying on people's fears and offering incomplete information in order to garner attention," a company spokeswoman said in an e-mailed statement.
"For example, it is inaccurate to say that Google uses sensitive health-related terms to target ads on affiliated web pages."
"All search engines and websites use referrer terms as part of the architecture of the web, but we recognize our responsibility to protect the data that users entrust to us and we give them meaningful choices to protect their privacy."
The meaningful choice here would be to drop the personal information from the Referrer.
Finally, I'm not alone in this call to action. Christopher Soghoian, who previously worked at the FTC and had been a Google intern, filed an FTC complaint in October of last year on this very subject. Here's his post on it and the associated WSJ post.