If you're not familiar with the concept of the Filter Bubble, check out this infographic we put out a few days ago about how it applies to search engines.
Currently, much of the debate seems to be over whether segregating results based on personal information is good or bad. I think that is the wrong debate and a false dichotomy. It can be good, e.g. filtering movie listings based on your zip code. And it can be bad, e.g. limiting the display of certain political viewpoints based on your search and click history.
From my perspective, the real debate is over a) which personal signals should be used; b) what controls we (as users) should have over how our personal signals are used; and c) how results that arise from the use of our personal signals should be presented. Personal signals are fundamentally different from other signals because as soon as you introduce them, different people start getting different results.
The central point of the Filter Bubble argument is that showing different people different results has consequences. By definition, you are segregating, grouping and then promoting results based on personal information, which necessitates less diversity in the result set since other results have to get demoted in the process. Of course you can introduce counter-measures to increase diversity, but that is just mitigating the degree to which it is happening. Consequences that follow from less diversity are things like increasing partisanship and decreasing exposure to alternative viewpoints.
My view is that when it comes to search engines in particular, the use of personal information should be as explicit and transparent as possible, with active user involvement in creating their profiles and fine-grained control over how they are used. Personalization is not a black-and-white feature. It doesn't have to be on or off. It isn't even one-dimensional. At a minimum, users should know which factors are being used, and at best they should be able to choose which factors are being used, to what degree, and in what contexts.
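To make the idea concrete, here is a minimal sketch (all names and signal categories are hypothetical, not any search engine's actual settings) of what per-factor, per-context personalization controls might look like: each personal signal is off by default, and the user chooses where and how strongly it applies.

```python
# Hypothetical sketch of fine-grained personalization controls:
# each signal can be enabled per context, with an adjustable weight.
from dataclasses import dataclass, field

@dataclass
class SignalSetting:
    enabled: bool = False                        # off by default: explicit opt-in
    weight: float = 0.0                          # 0.0 = ignore, 1.0 = full influence
    contexts: set = field(default_factory=set)   # e.g. {"local", "shopping"}

class PersonalizationProfile:
    def __init__(self):
        # The user sees exactly which factors exist and controls each one.
        self.signals = {
            "location": SignalSetting(),
            "search_history": SignalSetting(),
            "click_history": SignalSetting(),
        }

    def weight_for(self, signal, context):
        s = self.signals.get(signal)
        if s is None or not s.enabled or context not in s.contexts:
            return 0.0
        return s.weight

profile = PersonalizationProfile()
# The user opts location in for local queries only, at moderate strength.
profile.signals["location"] = SignalSetting(True, 0.7, {"local"})

print(profile.weight_for("location", "local"))      # 0.7
print(profile.weight_for("location", "news"))       # 0.0
print(profile.weight_for("click_history", "news"))  # 0.0
```

The point of the sketch is the shape of the control surface: signals are enumerable, opt-in, context-scoped, and graded, rather than a single personalization on/off switch.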
If you do not do that, and instead rely on implicit inference from passive data collection (searches, clicks, etc.), then the search engine is just left to "guess" at your personal profile. And that's why the examples from The Filter Bubble seem creepy to a lot of people. It seems like the search engine algorithm has inferred political affiliation, job, etc. without being explicitly told by the user.
This is not a conspiracy to segregate people, and I'm probably the farthest thing from a conspiracy theorist you'll find. It's just a natural consequence of algorithms that cluster people.
The questions then become 1) are they clustering me correctly; and 2) even if they are, do I want the fact that I belong to this cluster to influence my results for this particular search or type of search?
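A toy illustration of question 1, using pure stdlib and entirely made-up data: an engine that clusters users by click behavior has to bucket you somewhere, whether or not the bucket actually describes you.

```python
# Toy nearest-centroid clustering: the engine's implicit "guess" about a user.
# Data and cluster names are invented for illustration.
import math

# Each user is a vector of click rates on (politics, sports, cooking) results.
centroids = {
    "news_junkies": (0.8, 0.1, 0.1),
    "sports_fans":  (0.1, 0.8, 0.1),
    "home_cooks":   (0.1, 0.1, 0.8),
}

def assign_cluster(user):
    # Assign the user to the nearest cluster centroid (Euclidean distance).
    return min(centroids, key=lambda name: math.dist(user, centroids[name]))

# Someone who clicks cooking results while researching a news story
# still gets bucketed somewhere -- correctly or not.
me = (0.4, 0.1, 0.5)
print(assign_cluster(me))  # "home_cooks" -- but is that really me?
```

Question 2 then follows even when the assignment is accurate: should membership in "home_cooks" reweight my results for this particular search?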
Some people may want restaurant recommendations based on an implicit guess of their race and income class. Whether you care about that sort of thing largely determines where you come down on the debate. If you don't care at all, then you probably don't care whether you ever learn that your results differ from other people's, or how they differ based on your personal information.
On the other hand, if you do care, then you might want to know how and why a result based on your personal information got in front of you. You might also want to have much more fine-grained control over how particular personal signals are used (akin to privacy settings).
In other words, some people prefer to self-segregate and are interested in any and all forms of "personalization." And some people would prefer that segregation not occur without explicit user choice.
Please note I'm not disputing that showing people different results may result in "better results" for people. I agree that there is no universal best result for all queries.
What I'm saying is that you can get to that better result for a particular person in a number of ways. And I think that when it comes to search engines in particular, personal signals should be dealt with delicately and with active engagement from the user. It's two paths to the same thing, but the latter involves vastly more user choice and control.