Saturday, 24 March 2007

High Availability: DRBD rcks

On Thursday and Friday this week, I visited Linbit in Vienna. They are the creators of DRBD. Quoting Wikipedia:

DRBD is an acronym for Distributed Replicated Block Device. It is a Linux kernel module that, working together with some scripts, offers a distributed storage system, frequently used on high availability clusters. DRBD works as a kind of network RAID.

This means DRBD can give high availability to MySQL users. By configuring DRBD on your system, you get synchronous replication between two servers: every write to the replicated block device is mirrored to the second machine. That gives a MySQL database a standby server to fail over to quickly, should the main server running MySQL fail.
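To make the idea a bit more concrete, here is a toy model in Python of what "synchronous replication with failover" means. It is only a sketch of the concept, not DRBD's actual implementation or API: the `Node` and `MirroredDevice` classes, the node names and the block contents are all made up for illustration.

```python
class Node:
    """Hypothetical stand-in for one server and its local disk."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}    # block number -> bytes stored on this node
        self.alive = True

    def write_block(self, block_no, data):
        if not self.alive:
            raise IOError(f"{self.name} is down")
        self.blocks[block_no] = data


class MirroredDevice:
    """Toy model of a block device mirrored synchronously to a second node."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, block_no, data):
        # The write is reported as complete only after BOTH nodes stored it.
        self.primary.write_block(block_no, data)
        self.secondary.write_block(block_no, data)
        return True

    def failover(self):
        # Promote the secondary; it already holds every acknowledged write.
        self.primary, self.secondary = self.secondary, self.primary
        return self.primary


primary = Node("alpha")      # assumed host names, illustration only
secondary = Node("beta")
device = MirroredDevice(primary, secondary)

device.write(0, b"mysql data page")

primary.alive = False        # simulate the main server failing
new_primary = device.failover()
print(new_primary.name, new_primary.blocks[0])
```

The point of the sketch is the ordering inside `write`: because nothing is acknowledged until both copies exist, the promoted node has the same data the application believed it had written.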

For those interested in more detail on how to combine DRBD and MySQL, let me mention that Kristian Köhntopp of MySQL has written a great blog article, "Quick tour of DRBD".

I was impressed when listening to DRBD's main author DI Philipp Reisner describing the technical workings and business opportunities of DRBD. In many respects, he reminds me of our very own Monty years ago.

I also learnt plenty of things from Florian Haas, Senior Software Engineer with Linbit. Among other things, he taught me that r is a vowel (in many of Austria's neighbouring countries), meaning that you can pronounce DRBD without spelling out the letters. Sounds like "Good day!" in Slovenian.

On a more serious note, I think the prospects for DRBD look fascinating. Or in other words, remembering my recent insight on vowels: DRBD rcks!

1 comment:

  1. [...] Kaj Arnö has written an excellent blog post on the basics of DRBD. DRBD has one great feature that binary log replication doesn’t have: it can ensure that a write is synced to disk on two different hosts before allowing the application to continue. This is great for data redundancy, but it introduces potential for instability in the setup. In a good failover scenario, a problem on the backup master should never cause an issue on the primary master. With DRBD, the second master lagging behind because of a degraded RAID, a network issue, operator error (name your poison) causes issues on the primary master, because MySQL has to wait for writes to be synced to disk on _both_ machines before continuing. I know there are 3 different protocol modes that DRBD can operate in. Protocol C is really the only one that gives any extra data security over binary log replication, so it’s the one I’m focusing my attention on. If an issue on one master causes problems on another, then the benefit of having redundant masters is effectively lost. [...]