Chris,

Thanks for the benchmark numbers. I will check this. I recently observed this type of behavior as well; I will get back to you with some input and, probably, a fix.

Regards,
Amar

2008/8/7 Keith Freedman <freedman at freeformit.com>:

> At 08:20 AM 8/7/2008, Chris Davies wrote:
> > I'm not convinced that this is a network or hardware problem.
>
> It doesn't sound like it to me either. What do the server stats look like while you're untarring?
>
> Hopefully one of the Gluster devs will step in with some thoughts.
>
> > > Hope that wasn't confusing.
> > >
> > > At 10:05 PM 8/6/2008, Chris Davies wrote:
> > >> A continuation:
> > >>
> > >> I used XFS and MD RAID 1 on the partitions for the initial tests. I tested reiser3 and reiser4 with no significant difference. I rebuilt the array as MD RAID 0 with XFS and saw some improvement.
> > >>
> > >> I NFS-mounted the partition and got bonnie++ numbers similar to the best client-side AFR numbers I have been able to get, but unpacking the kernel over NFSv4/UDP took 1 minute 47 seconds, compared with 12 seconds for the bare drive, 41 seconds for server-side AFR, and an average of 17 minutes for client-side AFR.
> > >>
> > >> If I turn off AFR, whether I mount the remote machine over the net or use the local server's brick, tar xjf of a kernel takes roughly 29 seconds.
> > >>
> > >> Large files replicate at almost wire speed. rsync/cp -Rp of a large directory takes considerable time.
> > >>
> > >> Both QA releases of 1.4.0 that I've attempted (1.4.0qa32 and 1.4.0qa33) have broken within minutes using my configurations. I'll turn debug logs on and post summaries of those.
> > >>
> > >> On Aug 6, 2008, at 2:48 PM, Chris Davies wrote:
> > >>
> > >> > OS: Debian Linux/4.1, 64-bit build
> > >> > Hardware: quad-core Xeon X3220, 8 GB RAM, dual 7200 RPM 1000 GB WD hard drives, a 750 GB RAID 1 partition set as /gfsvol to be exported, dual GigE, Juniper EX3200 switch
> > >> >
> > >> > Fuse libraries: fuse-2.7.3glfs10
> > >> > Gluster: glusterfs-1.3.10
> > >> >
> > >> > Running bonnie++ on both machines results in almost identical numbers; eth1 is reserved wholly for server-to-server communication. Right now, the only load on these machines comes from my testbed. There are four tests that give a reasonable indicator of performance:
> > >> >
> > >> > * loading a WordPress blog and looking at the line:
> > >> >   <!-- 24 queries. 0.634 seconds. -->
> > >> > * dd if=/dev/zero of=/gfs/test/out bs=1M count=512
> > >> > * time tar xjf /gfs/test/linux-2.6.26.1.tar.bz2
> > >> > * /usr/sbin/bonnie++ /gfs/test/
> > >> >
> > >> > On the WordPress test, .3 seconds is typical. On various Gluster configurations I've seen between .411 seconds (server-side AFR config below) and 1.2 seconds with some of the example configurations. Currently, my client-side AFR config comes in at .5xx seconds rather consistently.
> > >> >
> > >> > The second test on the client-side AFR results in 536870912 bytes (537 MB) copied, 4.65395 s, 115 MB/s.
> > >> >
> > >> > The third test is unpacking a kernel, which has ranged from 28 seconds using the server-side AFR to 6+ minutes on some configurations. Currently the client-side AFR config comes in at about 17 minutes.
> > >> > The fourth test is a run of bonnie++, which varies from 36 minutes on the server-side AFR to an 80-minute run on the client-side AFR config.
> > >> >
> > >> > The current test environment uses both servers as clients and servers -- if I can get reasonable performance, the existing machines will become clients and the servers will be split onto their own platform, so I want to make sure I am using TCP for connections to stay as close to a real-world deployment as possible. This means I cannot run a client-only config.
> > >> >
> > >> > Baseline WordPress returns .311-.399 seconds.
> > >> > Baseline dd: 536870912 bytes (537 MB) copied, 0.489522 s, 1.1 GB/s
> > >> > Baseline tar xjf of the kernel: real 0m12.164s
> > >> > Baseline bonnie++ run on the RAID 1 partition (echo data | bon_csv2txt for the text reporting):
> > >> >
> > >> > c1ws1,16G,66470,97,93198,16,42430,6,60253,86,97153,7,381.3,0,16,7534,37,+++++,+++,5957,23,7320,34,+++++,+++,4667,21
> > >> >
> > >> > So far, the best performance I could manage was server-side AFR with writebehind/readahead on the server, aggregate-size set to 0MB, and the client side running writebehind/readahead. That resulted in:
> > >> >
> > >> > c1ws2,16G,37636,50,76855,3,17429,2,60376,76,87653,3,158.6,0,16,1741,3,9683,6,2591,3,2030,3,9790,5,2369,3
> > >> >
> > >> > It was suggested in IRC that client-side AFR would be faster and more reliable; however, I've ended up with the following as the best results from multiple attempts:
> > >> >
> > >> > c1ws1,16G,46041,58,76811,2,4603,0,59140,76,86103,3,132.4,0,16,1069,2,4795,2,1308,2,1045,2,5209,2,1246,2
> > >> >
> > >> > The bonnie++ run from the server-side AFR that produced the best results I've seen to date took 34 minutes. The latest client-side AFR bonnie++ run took 80 minutes. Based on the website, I would expect to see better performance than drbd/GFS, but so far that hasn't been the case.
> > >> >
> > >> > It's been suggested that I use unify-rr-afr. In my current setup, it seems that to do that I would need to break my RAID set, which is my next step in debugging this. Rather than use RAID 1 on the server, I would have two bricks on each server, which would allow the use of unify and the rr scheduler.
> > >> >
> > >> > glusterfs-1.4.0qa32 results in
> > >> >
> > >> > [Wed Aug 06 02:01:44 2008] [notice] child pid 14025 exit signal Bus error (7)
> > >> > [Wed Aug 06 02:01:44 2008] [notice] child pid 14037 exit signal Bus error (7)
> > >> >
> > >> > when apache (not mod_gluster) tries to serve files off the glusterfs partition.
> > >> >
> > >> > The main issue I'm having right now is file creation speed. I realize that to create a file I need to do two network ops for each file created, but it seems that something is horribly wrong in my configuration, given the results of untarring the kernel.
> > >> >
> > >> > I've tried moving the performance translators around, but some don't seem to make much difference on the server side, and the ones that appear to make some difference on the client side don't seem to help the file creation issue.
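For reference, here is a rough sketch of the unify-rr-afr layout Chris describes above: two data bricks per server plus a small namespace brick, mirrored across machines with AFR and scheduled round-robin with unify. This is untested; the volume names, the brick-a/brick-b/brick-ns export names, and the namespace arrangement are assumptions, not anything posted in this thread. It assumes each server exports brick-a, brick-b, and brick-ns the same way the existing server volfile exports brick.

volume s1-a
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.8.1.9
  option remote-subvolume brick-a
end-volume

volume s1-b
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.8.1.9
  option remote-subvolume brick-b
end-volume

volume s2-a
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.8.1.10
  option remote-subvolume brick-a
end-volume

volume s2-b
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.8.1.10
  option remote-subvolume brick-b
end-volume

volume s1-ns
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.8.1.9
  option remote-subvolume brick-ns
end-volume

volume s2-ns
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.8.1.10
  option remote-subvolume brick-ns
end-volume

# each file lives on one AFR pair; each pair spans both machines
volume afr0
  type cluster/afr
  subvolumes s1-a s2-a
end-volume

volume afr1
  type cluster/afr
  subvolumes s1-b s2-b
end-volume

# the unify namespace is mirrored as well
volume afr-ns
  type cluster/afr
  subvolumes s1-ns s2-ns
end-volume

volume unify0
  type cluster/unify
  option namespace afr-ns
  option scheduler rr    # round-robin new file creation across afr0 and afr1
  subvolumes afr0 afr1
end-volume

The point of this layout is to keep mirroring in AFR while dropping MD RAID 1 on the servers; whether rr across two spindles actually beats a single MD RAID 0 brick would need to be measured.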
> > >> > On a side note, zresearch.com: I emailed through your contact form and haven't heard back -- please provide a quote for generating the configuration and contact me off-list.
> > >> >
> > >> > === /etc/gluster/gluster-server.vol
> > >> >
> > >> > volume posix
> > >> >   type storage/posix
> > >> >   option directory /gfsvol/data
> > >> > end-volume
> > >> >
> > >> > volume plocks
> > >> >   type features/posix-locks
> > >> >   subvolumes posix
> > >> > end-volume
> > >> >
> > >> > volume writebehind
> > >> >   type performance/write-behind
> > >> >   option flush-behind off    # default is 'off'
> > >> >   subvolumes plocks
> > >> > end-volume
> > >> >
> > >> > volume readahead
> > >> >   type performance/read-ahead
> > >> >   option page-size 128kB           # 256KB is the default
> > >> >   option page-count 4              # 2 is the default
> > >> >   option force-atime-update off    # default is off
> > >> >   subvolumes writebehind
> > >> > end-volume
> > >> >
> > >> > volume brick
> > >> >   type performance/io-threads
> > >> >   option thread-count 4    # default is 1
> > >> >   option cache-size 64MB
> > >> >   subvolumes readahead
> > >> > end-volume
> > >> >
> > >> > volume server
> > >> >   type protocol/server
> > >> >   option transport-type tcp/server
> > >> >   subvolumes brick
> > >> >   option auth.ip.brick.allow 10.8.1.*,127.0.0.1
> > >> > end-volume
> > >> >
> > >> > === /etc/glusterfs/gluster-client.vol
> > >> >
> > >> > volume brick1
> > >> >   type protocol/client
> > >> >   option transport-type tcp/client    # for TCP/IP transport
> > >> >   option remote-host 10.8.1.9         # IP address of server1
> > >> >   option remote-subvolume brick       # name of the remote volume on server1
> > >> > end-volume
> > >> >
> > >> > volume brick2
> > >> >   type protocol/client
> > >> >   option transport-type tcp/client    # for TCP/IP transport
> > >> >   option remote-host 10.8.1.10        # IP address of server2
> > >> >   option remote-subvolume brick       # name of the remote volume on server2
> > >> > end-volume
> > >> >
> > >> > volume afr
> > >> >   type cluster/afr
> > >> >   subvolumes brick1 brick2
> > >> > end-volume
> > >> >
> > >> > volume writebehind
> > >> >   type performance/write-behind
> > >> >   option aggregate-size 0MB
> > >> >   option flush-behind off    # default is 'off'
> > >> >   subvolumes afr
> > >> > end-volume
> > >> >
> > >> > volume readahead
> > >> >   type performance/read-ahead
> > >> >   option page-size 128kB           # 256KB is the default
> > >> >   option page-count 4              # 2 is the default
> > >> >   option force-atime-update off    # default is off
> > >> >   subvolumes writebehind
> > >> > end-volume
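Since the thread keeps comparing against the server-side AFR numbers but only the client-side AFR volfiles are posted, here is a rough sketch of how a server-side AFR server volfile for 10.8.1.9 might look under GlusterFS 1.3. It is untested and the volume names, export arrangement, and auth lines are assumptions; the peer (10.8.1.10) would mirror it with the remote-host pointing back at 10.8.1.9.

volume posix
  type storage/posix
  option directory /gfsvol/data
end-volume

volume plocks
  type features/posix-locks
  subvolumes posix
end-volume

# connection to the *raw* (un-replicated) brick exported by the other
# server, so replication happens once, on the server side
volume remote
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.8.1.10
  option remote-subvolume plocks
end-volume

volume afr
  type cluster/afr
  subvolumes plocks remote
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 0MB
  subvolumes afr
end-volume

volume readahead
  type performance/read-ahead
  option page-size 128kB
  option page-count 4
  subvolumes writebehind
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  # export the raw brick for the peer and the replicated volume for clients
  subvolumes plocks readahead
  option auth.ip.plocks.allow 10.8.1.*,127.0.0.1
  option auth.ip.readahead.allow 10.8.1.*,127.0.0.1
end-volume

A client would then mount the readahead volume from one server only, optionally with its own write-behind/read-ahead stacked on top, which matches the "writebehind/readahead on the client side" arrangement Chris mentions.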
--
Amar Tumballi
Gluster/GlusterFS Hacker
[bulde on #gluster/irc.gnu.org]
http://www.zresearch.com - Commoditizing Super Storage!