Replicating PostgreSQL with Bucardo

 
If the title makes no sense to you, PostgreSQL is an open-source database. Replication means continuously copying changes from one database to another, e.g. for backup, scalability or high availability. And Bucardo is one of several pieces of software that help you achieve various forms of Postgres replication.

Why Bucardo?

First off, why Bucardo, especially since Postges 9.0 now has built-in hot standby/streaming replication (full docs)? Steve Signer wrote up some cases where you wouldn't want to/be able to use the built-in replication. A more detailed comparison of the various options is on the PG wiki

I won't repeat everything you can find there, but at the highest level the built-in system works on the whole DB cluster, as opposed to specific databases or even tables. Also, it is designed to do so by essentially mirroring files, so it is highly recommended that the platforms (including OS and Postgres versions) match as exactly as possible. Both of these requirements don't suit me. I run FreeBSD and Ubuntu and would really like to replicate only certain tables within certain databases.

After looking at the various options in detail, I wanted to use something (at least initially) both free and open source. Bucardo is written in Perl (so theoretically I could contribute patches); it is active (I looked at the changelog and mailing list); it is simple to setup and use (I looked at the docs); and it is very flexible (offers multi-master, multi-slave, and even cascading slave configurations).

However, note that Bucardo replication usually lags a few sec behind (depending on network config), so you will lose data if the master goes down and is not recovered properly. That is, it is not synchronous replication as available in 9.1, which confirms that changes are made in both DBs before committing (though also has the drawback of slowing things down a bit).

My stuff is generally read-only on the slaves and data can be out of sync (or lost), so the Bucardo model (which works off of triggers) is perfect for me at this time. Nevertheless, you can do failover via Bucardo, easier with a swap sync.

Bucardo Overview

The Bucardo documentation is pretty good so I won't rehash it all. Installation is straightforward -- you install a few Perl modules and then are left with a command line program called bucardo_ctl, which controls everything. I noticed the online documentation can differ from the man page in details, so you might want to look at both. Also on the wiki, i noticed not everything is in linked properly from the main pages, so you might want to use the search feature if you're looking for something in particular.

The documentation goes through an example if you want to play around with it. Here's an even simpler one:

createdb testa
createdb testb
create table test (id integer primary key);
bucardo_ctl add database testa name=testa
bucardo_ctl add database testb name=testb
bucardo_ctl add table test db=testa
bucardo_ctl add herd test_herd test
bucardo_ctl add sync test_sync source=test_herd targetdb=testb type=pushdelta

The above will create two databases. You should issue the create table command in both. Then you add the databases to the bucardo database (which is stored in a postgres DB named bucardo). A "herd" is just a set of tables to replicate, so we add the test table from testa. Then we add a sync.

bucardo_ctl stop
bucardo_ctl start
testa=# insert into test (id) values (1);
bucardo_ctl status
Ins/Upd/Del:          1 / 0 / 0
testb=# select * from test;
 id
----
  1
(1 row)
Stop and start bucardo, which is the easiest way to ensure the new sync starts. Then insert a row in testa and see it appear in testb.  bucardo_ctl status will tell you that the one row was inserted.

Non-obvious notes

After messing around with this live for a week, I've noticed a bunch of things that were non-obvious (at least to me) that you might want to keep in mind.

  • On install, you may need to add Perl to Postgres or you'll get a weird error. You can do that with this commnd:

    createlang plperl template1 -U pgsql

  • The verbosity flags to bucardo_ctl currently don't do much, e.g. --quiet doesn't make things quiet. To turn off the debug log (which gets big fast because it is high verbosity), use debugfile=0 when starting bucardo_ctl. Bucardo is aware of this issue and will be cleaning this up in future releases.

  • The sendmail=1 flag to bucardo_ctl works, but you have to set your from and to email first by doing bucardo_ctl set default_email_from=whatever, etc. However, if you have a situation where a sync fails repeatedly you'll get a ton of emails, like multiple a sec. So I turned this off for now until that case is fixed (I just reported it).

  • There is a debugdir flag you can pass to bucardo_ctl, If you don't set this, logfiles will get printed to the directory where it is started, so I would cd to that directory first or use the flag.

  • If you have a sync that fails, e.g. from network error, and you keep writing stuff to the master db, when it comes back online it will try to copy everything that changed at once. This can dramatically impact performance on the target. You could do a manual sync up in that case (on an off-peak time and in a secondary table) and avoid the bucardo process doing it for you. If you do that (or decide to just ignore those changes), you need to flush the track and delta tables for that database, e.g.

    psql db bucardo
    delete from bucardo_track;
    delete from bucardo_delta;

    If you also want to delete all the old sync info (since it could have failed quickly thousands of times), you need to do the following.

    psql bucardo bucardo
    delete from q;

    Note that in the first case you are connecting to the db as the bucardo user and not your regular database owner user. In the second case you are connecting to the bucardo db and not the db being replicated.

  • To completely remove bucardo (e.g. if you want to start over), it is not enough to remove the bucardo db since extra tables were added to the other databases (being replicated) and extra triggers to the tables within them. You need to also issue this command.

    psql template1 pgsql
    drop schema bucardo cascade;

  • If you want to completely shutdown a running sync going awry and make sure everything is stopped.

    bucardo_ctl stop
    ps auxww | grep -i bucardo | awk {'print $2'} | xargs kill -TERM

    You want to also issue that command on remote machines to ensure those processes are killed.

  • It initially confused me how to add a remote DB. You add it to bucardo like this:

    bucardo_ctl add database waki name=test host=test.duckduckgo.com

    That is you past in the fully qualified domain name as the host parameter.

  • To test that a sync can work, i.e. it can reach the host in a properly authenticated manner, you can do

    bucardo_ctl validate sync_name

    You will need to do two things for this to complete successfully. First, there needs to be a bucardo user on the remote machine with access to the db. The simplest way is to make it a super user, a la 

    su -m pgsql -c 'createuser -sDRw bucardo'

    on FreeBSD or on Ubuntu like this:

    sudo su -m postgres -c 'createuser -sDRw bucardo'

    Second, you need to allow the host machine to talk to the remote machine, i.e. you need to enable remote access. This is a two step process. In postgresql.conf you need to change listen_address and add in that IP or use * to listen on all. Second in pg_hba.conf you need to add in a way for the remote bucardo user to authenticate. Again the easiest way is to trust that IP, though security-minded people will probably want to shoot me for suggesting it. Either way, you can test it by using psql and passing in host parameters or just issuing the validate command above.

  • The quickest way to tell if things are working is to do

    bucardo_stl status

    which will tell you info on all your syncs or

    bucardo_ctl status sync_name

    for detailed info on one sync. Look at the Last_bad and Last_good times in particular.

  • To remove a sync, it is not enough to remove it in bucardo. You also have to remove the triggers it added on the various tables. You could remove everything as noted above and start over. But it if is just one table, you can just remove the triggers like so:

    psql db dbname
    drop trigger bucardo_add_delta ON table;
    drop trigger bucardo_triggerkick_sync_name ON table;

    Note if you have multiple syncs on the table you don't need to drop the delta table, but just the triggerkick one. If you don't do this and add another sync you'll end up with multiple triggers, which you don't want.

  • If you are replicating to multiple slaves, I found that it is better to use a sync for each one instead of one sync to all (via a db group). The reason is if one is unavailable it will bring down syncing to all, which a) stops syncing for good machines and b) results in that problem above where you do a big sync when everything is back up.

  • To delete a db test without removing your whole install, you can do this:

    bucardo_ctl deactivate sync_name
    bucardo_ctl delete sync sync_name
    bucardo_ctl delete target_db_name
    bucardo_ctl delete herd herd_name
    bucardo_ctl delete table table_name
    bucardo_ctl delete db source_db_name

    Then drop the triggers as noted in the above procedure.

  • When upgrading bucardo, it is not enough to just install the new Perl module. You also need to:

    bucardo_ctl upgrade

Conclusion

That was a lot of notes. While it ended up being more complicated than I originally anticipated, I am still currently happy with this solution. It does work pretty well.

Additionally, the Bucardo mailing list is very responsive. I reported two bugs and they were both fixed last night.
If you have comments, hit me up on Twitter:
I'm the CEO & Founder of DuckDuckGo, the search engine that doesn't track you. More about me.

About Me

RSS.