GlusterFS Performance gigE

roland at jotta.no (Roland Rabben) · Wed, 15 Sep 2010 11:15:40 +0200

Since most servers come standard with two NICs these days, an easy way to
improve write performance in a Distributed / Replicated setup is to put
each server of a replicated pair on a different subnet.

On the client simply configure NIC 1 to handle traffic on subnet 1, and NIC
2 to handle traffic on subnet 2.
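
As a rough sketch of that client setup, something like the following could
be used. The interface names (eth0, eth1) and all addresses are made-up
examples, not taken from our setup, and the script only prints the commands
rather than running them, since they need root.

```shell
#!/bin/sh
# Hypothetical interface names and addresses; adjust to your network.
NIC1=eth0; NIC1_ADDR=192.168.1.10/24   # subnet 1: first server of the pair
NIC2=eth1; NIC2_ADDR=192.168.2.10/24   # subnet 2: second server of the pair

# plan NIC ADDR
# Print (do not execute) the commands that would bring up one NIC.
plan() {
    echo "ip addr add $2 dev $1"
    echo "ip link set $1 up"
}

plan "$NIC1" "$NIC1_ADDR"
plan "$NIC2" "$NIC2_ADDR"
```

Because each address sits in a different subnet, the kernel's connected
routes send traffic for each server out of the matching NIC. Running
"ip route get" with a server's address shows which interface would be used.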

Since writes from the client go to both servers in the replicated pair
simultaneously, every write crosses the client's link twice, so write
performance on a single 1 Gbit NIC will probably max out around 55-60 MB/s.

Using two NICs will double this to about 110-120 MB/s.
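
The arithmetic behind these figures can be sketched as below. This is only
a rough estimate: the 115 MB/s of usable bandwidth per gigabit link is an
assumed round figure, and the function name is just for illustration.

```shell
#!/bin/sh
# Rough estimate of client write throughput to a replicated pair.
# Assumption: one 1 Gbit link carries about 115 MB/s of usable payload.
LINK_MBS=115

# est_write_mbs REPLICAS NICS
# The client sends each write once per replica, so the copies share
# the combined bandwidth of the client's links (integer MB/s).
est_write_mbs() {
    echo $(( LINK_MBS * $2 / $1 ))
}

est_write_mbs 2 1   # one NIC, two replicas: prints 57
est_write_mbs 2 2   # two NICs, two replicas: prints 115
```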

This seems to work great in our setup.

Regards

Roland Rabben

2010/9/13 Henrique Haas <henrique at dz6web.com>

> Hello,
>
> I've tested with a mix of performance translators, using real data:
>
> My test set has about 600K files, averaging 43KB each, all
> JPEG files.
> The total size is about 19GB.
> The underlying filesystem is ext4, with the default settings of Ubuntu Server
> 10.04 (it is configured as an LVM volume, by the way).
>
> My GlusterFS settings have used:
> *Server:* storage/posix, features/locks, performance/io-threads
> *Client:* 4 remote nodes > 2 Replicate > Write-Behind > IO-Threads >
> QuickRead > Stat-Prefetch
>
> From the documentation, Write-Behind seems to be the translator most likely
> to improve my write speed. I left its cache-size at 4MB.
>
> Ok, I did a simple "time cp -r /data /mnt", and results are not
> satisfactory:
> real    151m30.720s
> user    0m4.240s
> sys     1m9.160s
>
> Now, the same copy, but with all files joined into a single tarball (17GB):
> real    15m14.759s
> user    0m0.130s
> sys     0m31.010s
>
> Thank you very much for your attention!
>
> Regards!
>
>
>
> On Sat, Sep 11, 2010 at 8:19 PM, Daniel Mons <daemons at kanuka.com.au>
> wrote:
>
> > On Sun, Sep 12, 2010 at 3:20 AM, Henrique Haas <henrique at dz6web.com>
> > wrote:
> > > Hello Jacob,
> > >
> > > Greater block sizes gave me much, much better results, about *58MB/s* on
> > > a 1GigE link!
> > > So my concern now is about smaller files being shared using Gluster.
> > > Any tuning tips for this kind of file (I'm using Ext4 and Gluster
> > > 3.0.2)?
> >
> > "dd" won't give you accurate results for testing file copies.  Your
> > slow writes with small block sizes are more likely due to high I/O and
> > read starvation on the client side than to the server/write side.
> >
> > You should test something more real world instead.  For instance:
> >
> > for i in `seq 1 1000000` ; do dd if=/dev/urandom of=$i bs=1K count=1 ; done
> >
> > That will create 1,000,000 1KB files (1GB of information) with random
> > data on your local hard disk in the current directory.  Most file
> > systems store 4K blocks, so actual disk usage will be 4GB.
> >
> > Now copy/rsync/whatever these files to your Gluster storage.  (use a
> > command like "time cp -r /blah /mnt/gluster/" to wallclock it; with
> > 1,000,000 files a "/blah/*" glob would exceed the argument-list limit).
> >
> > Now tar up all the files, and do the copy again using the single large
> > tar file.  Compare your results.
> >
> > From here, tune your performance translators:
> >
> > http://www.gluster.com/community/documentation/index.php/Translators/performance/stat-prefetch
> > http://www.gluster.com/community/documentation/index.php/Translators/performance/quick-read
> > http://www.gluster.com/community/documentation/index.php/Translators/performance/io-cache
> > http://www.gluster.com/community/documentation/index.php/Translators/performance/writebehind
> > http://www.gluster.com/community/documentation/index.php/Translators/performance/readahead
> > http://www.gluster.com/community/documentation/index.php/Translators/performance/io-threads
> >
> > Some of these translators will aggregate smaller I/Os into larger
> > blocks to improve read/write performance.  The links above explain
> > what each one does.  My advice is to take the defaults created by
> > glusterfs-volgen and increment the values slowly on the relevant
> > translators (note that bigger doesn't always equal better - you'll
> > find a sweet spot where performance maxes out, and then most likely
> > reduces again once values get too big).
> >
> > And then continue testing.  Repeat for 4K, 16K, 32K files if you like
> > (or a mix of them) to match what sort of data you'd expect on your
> > file system (or better yet, use real world data if you have it lying
> > around already).
> >
> > Also, if you don't need atime (last access time) information on your
> > files, consider mounting the ext4 file system on the storage bricks
> > with the "noatime" option.  This can save unnecessary I/O on regularly
> > accessed files (I use this a lot on both clustered file systems as
> > well as virtual machine disk images and database files that get
> > touched all the time by multiple systems to reduce I/O).
> >
> > Hope that helps.
> >
> > -Dan
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> >
>
>
>
> --
> Henrique Haas
> +55 (51) 3028.6602
> www.dz6web.com
>

-- 
Roland Rabben
Founder & CEO Jotta AS
Cell: +47 90 85 85 39
Phone: +47 21 04 29 00
Email: roland at jotta.no