Monday, 10 December 2007

Combining MySQL Proxy with MySQL Cluster

A while ago, I had a discussion with Stewart Smith, Vinay Joosery, Monty Taylor and a number of other MySQLers who know much more about MySQL Cluster than I do. The result is a model for using MySQL Proxy to relieve MySQL Cluster of table scans, without touching the application.

The discussion started with me asking Stewart about the largest roadblock to expanding the number of use cases for MySQL Cluster. "Oh, that would probably be doing JOINs and other SELECTs requiring the scanning of large parts of the database", he replied. "There, other storage engines are faster, such as MyISAM and InnoDB."

In a very simple view, the application talks SQL with MySQL Cluster, and gets responses.

Stewart's insight can be worked into that first, simplistic diagram by recognising that "SQL" can consist of three kinds of statements:

  1. UPDATE, INSERT, DELETE statements (very light, usually affecting individual rows) -- unidirectional blue arrow in the diagram below

  2. Simple SELECT statements (also very light, defined as SELECTs that use indices and return individual rows) -- bidirectional black arrow

  3. Complex SELECT statements (could be as simple as "SELECT *", but defined as those that cannot easily use indices and usually return multiple rows) -- dashed arrows with two arrowheads to show that plenty of data is being returned

This second figure doesn't depict any change in application architecture from the first figure; it just shows a more granular view.

Now, enter the insight that plain MySQL Server (with MyISAM or InnoDB) can deliver the complex SELECTs faster.

In the new architecture represented by the above picture, we scale the application by

  1. introducing Replication (replicating MySQL Cluster to plain MySQL Server)

  2. changing the application to direct the complex SELECTs to MySQL Server instead of MySQL Cluster

This complicates life. Not only do we need to set up replication; we also need to touch the application all over the place to direct queries to the appropriate server.

Now, enter MySQL Proxy.

Using Lua scripts, MySQL Proxy can relieve us of the second complication, i.e. having to change the application to point to different MySQL Servers depending on the type of SQL. Let MySQL Proxy parse the traffic and direct it to the appropriate server! The application is left untouched, and the topmost part of the picture again has a simple bidirectional arrow saying "SQL". The distinction of what type of SQL we're talking about is left to MySQL Proxy.
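To make the routing decision concrete, here is a minimal sketch of the kind of classification such a script would have to perform. It is written in Python purely to illustrate the logic; a real MySQL Proxy script would be written in Lua against the proxy's own hooks. The backend names and the rule "a SELECT with a JOIN or without a WHERE clause counts as complex" are my assumptions, not something from the discussion. A proxy cannot see which queries actually use indices, so any such rule is only a heuristic.

```python
import re

# Hypothetical backend labels, chosen for illustration only.
CLUSTER = "mysql-cluster"
REPLICA = "mysql-server-replica"

_WRITE = re.compile(r"^\s*(INSERT|UPDATE|DELETE|REPLACE)\b", re.IGNORECASE)
_SELECT = re.compile(r"^\s*SELECT\b", re.IGNORECASE)
_JOIN = re.compile(r"\bJOIN\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def classify(sql: str) -> str:
    """Sort a statement into one of the three classes from the list above.

    Assumed heuristic: a SELECT is "complex" if it contains a JOIN or has
    no WHERE clause at all. Keywords inside string literals will fool it.
    """
    if _WRITE.match(sql):
        return "write"
    if _SELECT.match(sql):
        if _JOIN.search(sql) or not _WHERE.search(sql):
            return "complex_select"
        return "simple_select"
    return "other"


def route(sql: str) -> str:
    """Send complex SELECTs to the replica; everything else stays on Cluster."""
    return REPLICA if classify(sql) == "complex_select" else CLUSTER


if __name__ == "__main__":
    for statement in (
        "UPDATE t SET a = 1 WHERE id = 5",
        "SELECT name FROM t WHERE id = 7",
        "SELECT * FROM t",
    ):
        print(route(statement), "<-", statement)
```

Anything the heuristic does not recognise is sent to MySQL Cluster, so a misclassified query is merely slower rather than being run against the wrong kind of server.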

MySQL Proxy can also load-balance the queries across a number of MySQL Replication Slaves.
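As a rough illustration of that load balancing, the sketch below hands out slaves in simple round-robin order. Again this is Python for illustration, not a MySQL Proxy script; the class name, the slave names and the choice of round-robin (rather than, say, picking the least loaded slave) are all assumptions on my part.

```python
from itertools import cycle


class ReplicaBalancer:
    """Round-robin picker over a fixed list of replication slaves (hypothetical helper)."""

    def __init__(self, replicas):
        replicas = list(replicas)
        if not replicas:
            raise ValueError("at least one replica is required")
        # cycle() repeats the list endlessly in its original order.
        self._ring = cycle(replicas)

    def next_replica(self):
        """Return the slave that should receive the next complex SELECT."""
        return next(self._ring)


# Placeholder slave names, not real hosts.
balancer = ReplicaBalancer(["replica-1", "replica-2", "replica-3"])
picks = [balancer.next_replica() for _ in range(4)]
print(picks)
```

Round-robin is about the simplest policy that spreads the load evenly when all slaves are equally powerful; it ignores how busy each slave currently is.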

This picture is clearly the most complex architecture depicted here, but it also represents the highest level of scaling.

All of the above can be done using current versions of MySQL Cluster and MySQL Server, and the MySQL Proxy.