I often get asked what Duck Duck Go "runs on." This post answers that question by outlining the major moving parts that serve queries, i.e. its architecture. In another post I'll detail what, in particular, makes it fast, i.e. the tunables and other specifics.
Caveat: this architecture was designed for maximum query speed for our initial soft launch. It was also designed with eventual scalability somewhat in mind, but we don't have that much traffic yet (though we are growing at a nice clip). So don't take this as advice like you might get at High Scalability. It's really just for your amusement. However, my last startup did have some scale (relatively speaking, of course), so I know a bit about what I'm doing...
- DNS is served by DNS Made Easy. I used to serve it myself via djbdns, but DNS Made Easy is faster, is cheap, and makes it easier for me to deal with fail-over.
- All requests come into nginx. I used to use two instances of Apache, one for dynamic requests and one for static files. But nginx is faster, uses less memory, and is more stable.
- If the request is for a static file, e.g. the home page, nginx serves it directly. It's really good at that.
- Otherwise, nginx checks my memcached store. I hadn't used memcached before this, and I'm finding it a big win.
- If the response is not in the memcached store, nginx proxies to FastCGI processes that are running in the background. I hadn't used FastCGI before this, as I had always used mod_perl with Apache.
- The FastCGI processes are managed by daemontools (as is memcached). At first I was worried about stability in these processes, but it hasn't proved to be an issue yet.
- Internally, the FastCGI scripts are written in Perl and run by the FCGI::Engine Perl module.
- The Perl scripts access a PostgreSQL database (when needed) to retrieve our zero-click information, among other things.
- The whole thing runs on FreeBSD.
- For fail-over and scalability purposes, I have EC2 images that replicate the above, except that they run on Ubuntu (since FreeBSD wasn't available on EC2 at the time).
- All of our site icons and zero-click info images are hosted on S3.
- We also reference some external YUI JS files.
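
To make the lookup order in the list above concrete, here is a rough sketch of the three tiers a request falls through: static file, then memcached, then the FastCGI backend. It is only an illustration in Python, not the actual nginx configuration or Perl code. The names `handle_request`, `static_files`, `cache`, and `backend` are all made up for this example, plain dictionaries stand in for the file system and memcached, and the step where the backend's answer gets written into the cache is an assumption (the list doesn't say which component populates memcached).

```python
from typing import Callable, Dict, Tuple


def handle_request(
    path: str,
    static_files: Dict[str, str],
    cache: Dict[str, str],
    backend: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (tier, body) for a request, trying the cheapest tier first.

    All parameters are hypothetical stand-ins:
      static_files -- files the front-end server can serve directly
      cache        -- a dict playing the role of the memcached store
      backend      -- a callable playing the role of a FastCGI process
    """
    # Tier 1: static files are served directly by the front-end server.
    if path in static_files:
        return "static", static_files[path]

    # Tier 2: check the cache before doing any real work.
    cached = cache.get(path)
    if cached is not None:
        return "cache", cached

    # Tier 3: fall through to the backend process.
    body = backend(path)

    # Assumption: the result is stored so the next identical request is a
    # cache hit. The post does not say who writes to memcached.
    cache[path] = body
    return "backend", body
```

Calling it twice with the same non-static path should hit the backend the first time and the cache the second time, which is the whole point of putting the cache in front of the slower dynamic tier.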
Also, I'd love any feedback on this architecture. I'm always looking for ways to speed it up!
Update: additional comments can be found here.