Fwd: files not syncing up with glusterfs 3.1.2

I'm working with Paul on this.

We did take advice on XFS beforehand, and were given the impression that it
would just be a performance issue rather than things not actually working.
We've got quite fast hardware, and we're more comfortable with XFS than ext4
from our own experience, so we ran our own tests and were happy with XFS
performance.

Likewise, we're aware of gluster's very poor performance with small files.
We serve a lot of large files, and we've now moved most of the small files
off to a normal NFS server. Again, small files aren't known to actually
break gluster, are they?

David

On 21 February 2011 14:42, Fabricio Cannini <fcannini at gmail.com> wrote:

> On Friday, 18 February 2011, at 23:24:10, paul simpson wrote:
> > hello all,
> >
> > i have been testing gluster as a central file server for a small
> > animation studio/post production company.  my initial experiments were
> > using the fuse glusterfs protocol - but that ran extremely slowly for
> > home dirs and general file sharing.  we have since switched to using NFS
> > over glusterfs.  NFS has certainly seemed more responsive re. stat and
> > dir traversal.  however, i'm now being plagued with three different
> > types of errors:
> >
> > 1/ Stale NFS file handle
> > 2/ input/output errors
> > 3/ and a new one:
> > $ l -l /n/auto/gv1/production/conan/hda/published/OLD/
> > ls: cannot access /n/auto/gv1/production/conan/hda/published/OLD/shot:
> > Remote I/O error
> > total 0
> > d????????? ? ? ? ?                ? shot
> >
> > ...so it's a bit all over the place.  i've tried rebooting both servers
> > and clients.  these issues are very erratic - they come and go.
> >
> > some information on my setup: glusterfs 3.1.2
> >
> > g1:~ # gluster volume info
> >
> > Volume Name: glustervol1
> > Type: Distributed-Replicate
> > Status: Started
> > Number of Bricks: 4 x 2 = 8
> > Transport-type: tcp
> > Bricks:
> > Brick1: g1:/mnt/glus1
> > Brick2: g2:/mnt/glus1
> > Brick3: g3:/mnt/glus1
> > Brick4: g4:/mnt/glus1
> > Brick5: g1:/mnt/glus2
> > Brick6: g2:/mnt/glus2
> > Brick7: g3:/mnt/glus2
> > Brick8: g4:/mnt/glus2
> > Options Reconfigured:
> > performance.write-behind-window-size: 1mb
> > performance.cache-size: 1gb
> > performance.stat-prefetch: 1
> > network.ping-timeout: 20
> > diagnostics.latency-measurement: off
> > diagnostics.dump-fd-stats: on
> >
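> > (for reference, these were set with the usual volume set command - e.g.
> > roughly:
> >   g1:~ # gluster volume set glustervol1 performance.cache-size 1gb
> >   g1:~ # gluster volume set glustervol1 network.ping-timeout 20
> > )
> >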
> > that is 4 servers - serving ~30 clients - 95% linux, 5% mac.  all NFS.
> > other points:
> > - i'm automounting using NFS via autofs (with ldap).  ie:
> >     gus:/glustervol1 on /n/auto/gv1 type nfs
> >     (rw,vers=3,rsize=32768,wsize=32768,intr,sloppy,addr=10.0.0.13)
> >   gus is pointing to rr dns machines (g1,g2,g3,g4).  that all seems to be
> >   working.
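> >   (for anyone wanting to reproduce that mount by hand, it should be
> >   roughly equivalent to:
> >     mount -t nfs -o rw,vers=3,rsize=32768,wsize=32768,intr gus:/glustervol1 /n/auto/gv1
> >   )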
> >
> > - backend file system on g[1-4] is xfs.  ie,
> >
> > g1:/var/log/glusterfs # xfs_info /mnt/glus1
> > meta-data=/dev/sdb1              isize=256    agcount=7, agsize=268435200 blks
> >          =                       sectsz=512   attr=2
> > data     =                       bsize=4096   blocks=1627196928, imaxpct=5
> >          =                       sunit=256    swidth=2560 blks
> > naming   =version 2              bsize=4096   ascii-ci=0
> > log      =internal               bsize=4096   blocks=32768, version=2
> >          =                       sectsz=512   sunit=8 blks, lazy-count=0
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> >
> >
> > - sometimes root can stat/read the file in question while the user
> >   cannot!  i can remount the same NFS share to another mount point -
> >   and i can then see the file with the same user.
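> >
> >   (a note for anyone digging further: the replica state of such a file
> >   can be dumped on each brick with something along these lines - the
> >   brick-side path here is assumed:
> >     getfattr -d -m . -e hex /mnt/glus1/production/conan/hda/published/OLD/shot
> >   )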
> >
> > - sample output of g1 nfs.log file:
> >
> > [2011-02-18 15:27:07.201433] I [io-stats.c:338:io_stats_dump_fd]
> > glustervol1:       Filename :
> > /production/conan/hda/published/shot/backup/.svn/tmp/entries
> > [2011-02-18 15:27:07.201445] I [io-stats.c:353:io_stats_dump_fd]
> > glustervol1:   BytesWritten : 1414 bytes
> > [2011-02-18 15:27:07.201455] I [io-stats.c:365:io_stats_dump_fd]
> > glustervol1: Write 001024b+ : 1
> > [2011-02-18 15:27:07.205999] I [io-stats.c:333:io_stats_dump_fd]
> > glustervol1: --- fd stats ---
> > [2011-02-18 15:27:07.206032] I [io-stats.c:338:io_stats_dump_fd]
> > glustervol1:       Filename :
> > /production/conan/hda/published/shot/backup/.svn/props/tempfile.tmp
> > [2011-02-18 15:27:07.210799] I [io-stats.c:333:io_stats_dump_fd]
> > glustervol1: --- fd stats ---
> > [2011-02-18 15:27:07.210824] I [io-stats.c:338:io_stats_dump_fd]
> > glustervol1:       Filename :
> > /production/conan/hda/published/shot/backup/.svn/tmp/log
> > [2011-02-18 15:27:07.211904] I [io-stats.c:333:io_stats_dump_fd]
> > glustervol1: --- fd stats ---
> > [2011-02-18 15:27:07.211928] I [io-stats.c:338:io_stats_dump_fd]
> > glustervol1:       Filename :
> > /prod_data/xmas/lgl/pic/mr_all_PBR_HIGHNO_DF/035/1920x1080/mr_all_PBR_HIGHNO_DF.6084.exr
> > [2011-02-18 15:27:07.211940] I [io-stats.c:343:io_stats_dump_fd]
> > glustervol1:       Lifetime : 8731secs, 610796usecs
> > [2011-02-18 15:27:07.211951] I [io-stats.c:353:io_stats_dump_fd]
> > glustervol1:   BytesWritten : 2321370 bytes
> > [2011-02-18 15:27:07.211962] I [io-stats.c:365:io_stats_dump_fd]
> > glustervol1: Write 000512b+ : 1
> > [2011-02-18 15:27:07.211972] I [io-stats.c:365:io_stats_dump_fd]
> > glustervol1: Write 002048b+ : 1
> > [2011-02-18 15:27:07.211983] I [io-stats.c:365:io_stats_dump_fd]
> > glustervol1: Write 004096b+ : 4
> > [2011-02-18 15:27:07.212009] I [io-stats.c:365:io_stats_dump_fd]
> > glustervol1: Write 008192b+ : 4
> > [2011-02-18 15:27:07.212019] I [io-stats.c:365:io_stats_dump_fd]
> > glustervol1: Write 016384b+ : 20
> > [2011-02-18 15:27:07.212030] I [io-stats.c:365:io_stats_dump_fd]
> > glustervol1: Write 032768b+ : 54
> > [2011-02-18 15:27:07.228051] I [io-stats.c:333:io_stats_dump_fd]
> > glustervol1: --- fd stats ---
> > [2011-02-18 15:27:07.228078] I [io-stats.c:338:io_stats_dump_fd]
> > glustervol1:       Filename :
> > /production/conan/hda/published/shot/backup/.svn/tmp/entries
> >
> > ...so, the files that aren't working don't have lifetime or bytes
> > read/written lines after their log entry.
> >
> > all very perplexing - and scary.  one thing that reliably fails is using
> > svn working dirs on the gluster filesystem.  nfs locks keep being
> > dropped.  this is temporarily fixed when i view the file as root (on a
> > client) - but the problem re-appears very quickly.  i assume that gluster
> > is up to something as simple as hosting svn working dirs?
> >
> > i'm hoping i've done something stupid which is easily fixed.  we seem so
> > close - but right now, i'm at a loss and losing confidence.  i would
> > greatly appreciate any help/pointers out there.
> >
> > regards,
> >
> > paul
>
> Hi Paul.
>
> I've been using gluster for ~6 months now, so i'm by no means an expert,
> but i can see that you're doing two things that are discouraged by the
> developers:
>
> - Using xfs as a backend filesystem
> - Serving small files ( < 1MB in size )
> ( I'm assuming the latter because of log messages like this: )
>
> > [2011-02-18 15:27:07.201433] I [io-stats.c:338:io_stats_dump_fd]
> > glustervol1:       Filename :
> > /production/conan/hda/published/shot/backup/.svn/tmp/entries
> > [2011-02-18 15:27:07.201445] I [io-stats.c:353:io_stats_dump_fd]
> > glustervol1:   BytesWritten : 1414 bytes
>
> Can anyone please confirm or correct my assumptions?
>
> TIA



-- 
David Lloyd
V Consultants
www.v-consultants.co.uk
tel: +44 7983 816501
skype: davidlloyd1243

