David Sickmiller wrote:
I'm running 2.0rc1 with the 2.6.27 kernel. I have a 2-node cluster.
GlusterFS runs on both nodes, and MySQL runs on the active node. If the
active node fails or is put on standby, MySQL fires up on the other
node. Unlike MySQL Replication with its slave lag, I know my data
changes are durable in the event of a server failure. Most people use
DRBD for this, but I'm hoping to enjoy GlusterFS's benefits of handling
split-brain situations at the file level instead of the volume level,
future scalability avenues, and general ease of use. Hopefully DRBD
doesn't have unmatchable performance advantages I'm overlooking.
Note that DRBD resync is more efficient - it only resyncs dirty blocks,
which can be much faster for big databases. Gluster will copy the whole
file.
I'm going to report my testing in order, because the changes were
cumulative. I used server-side io-threads from the start. Before I
started recording the speed, I discovered that running in single process
mode was dramatically faster. At that time, I also configured
read-subvolume to use the local server. At this point I started measuring:
* Printing schema: 18s
* Compressed export: 2m45s
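For reference, that kind of single-process setup can be expressed in one
volfile per node, roughly like the sketch below; the hostnames, paths and
thread count are assumptions rather than the exact configuration used:

  # local storage brick
  volume posix
    type storage/posix
    option directory /data/glusterfs       # assumed export path
  end-volume

  volume locks
    type features/locks
    subvolumes posix
  end-volume

  # server-side io-threads
  volume brick
    type performance/io-threads
    option thread-count 8                  # assumed value
    subvolumes locks
  end-volume

  # export the local brick to the peer node
  volume server
    type protocol/server
    option transport-type tcp
    option auth.addr.brick.allow *         # tighten this in production
    subvolumes brick
  end-volume

  # connection to the peer node's exported brick
  volume remote
    type protocol/client
    option transport-type tcp
    option remote-host node2               # assumed peer hostname
    option remote-subvolume brick
  end-volume

  # replicate across local and remote, preferring local reads;
  # NB: both nodes must list the same physical brick first (lock server),
  # which is the issue discussed further down
  volume replicate
    type cluster/replicate
    option read-subvolume brick            # read from the local copy
    subvolumes brick remote
  end-volume

The same glusterfs process then mounts the replicate volume over FUSE, so
there is no separate client process talking to a local server over TCP.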
For a benchmark, I moved MySQL's datafiles to the local ext3 disk (but
kept writing the export to GlusterFS). It was 10-100X faster!
* Printing schema: 0.2s
* Compressed export: 28s
Did you flush the caches in between the runs? What is your network
connection between the nodes?
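If the page cache is the question, one general way to get cold-cache
numbers between runs (nothing GlusterFS-specific) is to drop it on both
nodes before each test:

  sync                                 # flush dirty pages first
  echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes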
There were no appreciable changes from installing fuse-2.7.4glfs11, using
Booster, or running blockdev to increase readahead from 256 to 16384.
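The readahead change mentioned above would have been done with something
along these lines (the device name is a placeholder):

  blockdev --getra /dev/sda        # current readahead, in 512-byte sectors
  blockdev --setra 16384 /dev/sda  # raise it to 16384 sectors (8 MB)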
Adding the io-cache client-side translator didn't affect printing the
schema but cut the export in half:
* Compressed export: 1m10s
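A client-side io-cache translator stacked on top of replicate looks
roughly like this (the cache size is an assumed value, not the one used):

  volume iocache
    type performance/io-cache
    option cache-size 64MB          # assumed value
    subvolumes replicate
  end-volume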
Going off on a tangent, I shut down the remote node. This increased the
performance by an order of magnitude:
* Printing schema: 2s
* Compressed export: 24s
What is the ping time between the servers? Have you measured the
throughput between the servers with something like ftp on big files? Is
it the writes or the reads that slow down? Try dumping to an ext3
filesystem from gluster.
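A quick way to get those latency and throughput numbers, with placeholder
hostnames (netcat option syntax varies between variants):

  ping -c 10 node2                     # round-trip latency between the nodes

  # raw TCP throughput, no disks involved:
  nc -l -p 9999 > /dev/null                         # on node2
  dd if=/dev/zero bs=1M count=1024 | nc node2 9999  # on node1, pushes 1 GB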
I resumed testing with both servers running. Switching the I/O
scheduler to deadline had no appreciable effect. Neither did adding
client-side io-threads or server-side write-behind. Surprisingly, I
found that changing read-subvolume to the remote server had only a minor
penalty.
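For completeness, these are the kinds of changes that were being tried;
the device name and option values are assumptions, and the volfile
fragments are sketches rather than the exact configuration:

  echo deadline > /sys/block/sda/queue/scheduler   # switch the elevator

and, in the volfiles, translators along these lines:

  volume iot-client
    type performance/io-threads      # client-side io-threads
    option thread-count 4            # assumed value
    subvolumes iocache
  end-volume

  volume wb
    type performance/write-behind    # server-side write-behind
    option flush-behind on           # assumed option
    subvolumes brick
  end-volume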
Are you using single process client/server on each node, or separate
client and server processes on both nodes?
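For reference, the distinction being asked about (paths and names are
placeholders): separate processes means each node runs an export daemon
plus a client mount, while single-process means one volfile containing
both halves and one process doing both jobs:

  # separate server and client processes
  glusterfsd -f /etc/glusterfs/server.vol
  glusterfs  -f /etc/glusterfs/client.vol /mnt/glusterfs

  # single process: exports the local brick and mounts the replicated
  # volume from the same volfile
  glusterfs -f /etc/glusterfs/single.vol --volume-name replicate /mnt/glusterfs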
Then I noticed that the remote server was listed first in the volfile,
which means that it gets used as the lock server. Swapping the order
in the volfile on one server seemed to cause split-brain errors -- does
the order need to be the same on both servers?
Yes, the first server listed is the lock server. If you list them in a
different order on each node, locking will break. The listed order is
also the lock server fail-over order.
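In volfile terms the lock server is the first subvolume listed under the
replicate volume, so both nodes have to agree on which physical brick
that is; with assumed names node1-brick and node2-brick:

  volume replicate
    type cluster/replicate
    # node1-brick is first on *both* nodes, so node1 is the lock server;
    # if node1 dies, locking fails over to node2-brick
    subvolumes node1-brick node2-brick
  end-volume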
When I changed both servers' volfiles to use the active MySQL server as
the lock server, there was a dramatic performance increase, to roughly
the 2s/24s speed I saw with one server down. (I lost the exact stats.)
In summary, running in single process mode, client-side io-cache, and a
local lock server were the changes that made a significant difference.
That makes sense, especially the local lock server. The time it takes to
take a lock locally in memory is going to be orders of magnitude faster
than the ping time, even on gigabit ethernet.
Since I'm only going to have one server writing to the filesystem at a
time, I could mount it read-only (or not at all) on the other server.
Would that mean I could safely set data-lock-server-count=0 and
entry-lock-server-count=0 because I can be confident that there won't be
any conflicting writes? I don't want to take unnecessary risks, but it
seems like unnecessary overhead for my use case.
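For reference, what is being proposed would sit on the replicate volume
like this (whether it is safe here is exactly the question being asked):

  volume replicate
    type cluster/replicate
    option data-lock-server-count 0    # don't lock for file data writes
    option entry-lock-server-count 0   # don't lock for directory entry changes
    subvolumes node1-brick node2-brick
  end-volume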
Hmm... If the 1st server fails, the lock server role will fail over to
the next one, and you then fire up MySQL there. I thought you said it was only
the 2nd server that suffers the penalty. Since the 2nd server will fail
over locking from the 1st if the 1st fails, the performance should be
the same after fail-over. You'll still have the active server being the
lock server.
Gordan