paul simpson wrote:
> hello all,
>
> i have been testing gluster as a central file server for a small animation
> studio/post-production company. my initial experiments were using the fuse
> glusterfs protocol - but that ran extremely slowly for home dirs and
> general file sharing. we have since switched to using NFS over glusterfs.
> NFS has certainly seemed more responsive re. stat and dir traversal.
> however, i'm now being plagued by three different types of errors:
>
> 1/ Stale NFS file handle
> 2/ input/output errors
> 3/ and a new one:
>
> $ l -l /n/auto/gv1/production/conan/hda/published/OLD/
> ls: cannot access /n/auto/gv1/production/conan/hda/published/OLD/shot:
> Remote I/O error
> total 0
> d????????? ? ? ? ? ? shot
>
> ...so it's a bit all over the place. i've tried rebooting both servers
> and clients. these issues are very erratic - they come and go.
>
> some information on my setup: glusterfs 3.1.2
>
> g1:~ # gluster volume info
>
> Volume Name: glustervol1
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 4 x 2 = 8
> Transport-type: tcp
> Bricks:
> Brick1: g1:/mnt/glus1
> Brick2: g2:/mnt/glus1
> Brick3: g3:/mnt/glus1
> Brick4: g4:/mnt/glus1
> Brick5: g1:/mnt/glus2
> Brick6: g2:/mnt/glus2
> Brick7: g3:/mnt/glus2
> Brick8: g4:/mnt/glus2
> Options Reconfigured:
> performance.write-behind-window-size: 1mb
> performance.cache-size: 1gb
> performance.stat-prefetch: 1
> network.ping-timeout: 20
> diagnostics.latency-measurement: off
> diagnostics.dump-fd-stats: on
>
> that is 4 servers - serving ~30 clients - 95% linux, 5% mac. all NFS.

Mac OS as an NFS client remains untested against Gluster NFS. Do you see
these errors on Mac or Linux clients?

> other points:
>
> - i'm automounting using NFS via autofs (with ldap), ie:
> gus:/glustervol1 on /n/auto/gv1 type nfs
> (rw,vers=3,rsize=32768,wsize=32768,intr,sloppy,addr=10.0.0.13)
> gus is pointing to round-robin dns across the machines (g1,g2,g3,g4).
> that all seems to be working.
>
> - the backend file system on g[1-4] is xfs, ie:
>
> g1:/var/log/glusterfs # xfs_info /mnt/glus1
> meta-data=/dev/sdb1              isize=256    agcount=7, agsize=268435200 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=1627196928, imaxpct=5
>          =                       sunit=256    swidth=2560 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=32768, version=2
>          =                       sectsz=512   sunit=8 blks, lazy-count=0
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
> - sometimes root can stat/read the file in question while the user
> cannot! i can remount the same NFS share on another mount point - and i
> can then see the file as the same user.

I think that may be occurring because NFS+LDAP requires a slightly
different authentication scheme compared to an NFS-only setup. Please try
the same test without LDAP in the middle.

> - sample output of the g1 nfs.log file:
>
> [2011-02-18 15:27:07.201433] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1: Filename :
> /production/conan/hda/published/shot/backup/.svn/tmp/entries
> [2011-02-18 15:27:07.201445] I [io-stats.c:353:io_stats_dump_fd]
> glustervol1: BytesWritten : 1414 bytes
> [2011-02-18 15:27:07.201455] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 001024b+ : 1
> [2011-02-18 15:27:07.205999] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.206032] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1: Filename :
> /production/conan/hda/published/shot/backup/.svn/props/tempfile.tmp
> [2011-02-18 15:27:07.210799] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.210824] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1: Filename :
> /production/conan/hda/published/shot/backup/.svn/tmp/log
> [2011-02-18 15:27:07.211904] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.211928] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1: Filename :
> /prod_data/xmas/lgl/pic/mr_all_PBR_HIGHNO_DF/035/1920x1080/mr_all_PBR_HIGHNO_DF.6084.exr
> [2011-02-18 15:27:07.211940] I [io-stats.c:343:io_stats_dump_fd]
> glustervol1: Lifetime : 8731secs, 610796usecs
> [2011-02-18 15:27:07.211951] I [io-stats.c:353:io_stats_dump_fd]
> glustervol1: BytesWritten : 2321370 bytes
> [2011-02-18 15:27:07.211962] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 000512b+ : 1
> [2011-02-18 15:27:07.211972] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 002048b+ : 1
> [2011-02-18 15:27:07.211983] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 004096b+ : 4
> [2011-02-18 15:27:07.212009] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 008192b+ : 4
> [2011-02-18 15:27:07.212019] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 016384b+ : 20
> [2011-02-18 15:27:07.212030] I [io-stats.c:365:io_stats_dump_fd]
> glustervol1: Write 032768b+ : 54
> [2011-02-18 15:27:07.228051] I [io-stats.c:333:io_stats_dump_fd]
> glustervol1: --- fd stats ---
> [2011-02-18 15:27:07.228078] I [io-stats.c:338:io_stats_dump_fd]
> glustervol1: Filename :
> /production/conan/hda/published/shot/backup/.svn/tmp/entries
>
> ...so, the files that aren't working have no Lifetime or bytes
> read/written lines after their log entry.

I'll need the log from the NFS server at TRACE log level while you run a
command that results in any of the errors above, i.e. stale file handle,
remote I/O error and input/output error. Thanks.
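For reference, here's roughly how I'd capture that - a sketch, assuming
3.1.2 honours the diagnostics.client-log-level volume option for the NFS
server process (the NFS server is built from the client-side translator
stack, so it should):

g1:~ # gluster volume set glustervol1 diagnostics.client-log-level TRACE

Then, from a client, reproduce one of the failures, e.g.:

$ ls -l /n/auto/gv1/production/conan/hda/published/OLD/

and send us /var/log/glusterfs/nfs.log from whichever server that client's
mount resolved to (the addr= field in your mount output tells you which).
TRACE is extremely verbose, so turn it back down once you have the log:

g1:~ # gluster volume set glustervol1 diagnostics.client-log-level INFO

If your build rejects the option, restarting the glusterfs NFS process
with --log-level=TRACE achieves the same thing.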
> all very perplexing - and scary. one thing that reliably fails is using
> svn working dirs on the gluster filesystem. nfs locks keep being dropped.
> this is temporarily fixed when i view the file as root (on a client) -
> but then it re-appears very quickly. i assume that gluster is up to
> something as simple as hosting svn working dirs?
>
> i'm hoping i've done something stupid which is easily fixed. we seem so
> close - but right now, i'm at a loss and losing confidence. i would
> greatly appreciate any help/pointers out there.
>
> regards,
>
> paul
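PS: for the "without LDAP in the middle" test above, something along these
lines should do. A sketch only - /mnt/test and the testuser account are
placeholders for a mount point and a locally defined (non-LDAP) user on a
test client. Mounting one server directly also takes autofs and the
round-robin dns out of the equation:

client:~ # mount -t nfs -o vers=3,rsize=32768,wsize=32768,intr \
    g1:/glustervol1 /mnt/test
client:~ # su - testuser -c \
    'stat /mnt/test/production/conan/hda/published/OLD/shot'

If the local user can stat files that an LDAP user cannot, that points at
the auth/id-mapping path rather than gluster itself.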
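PS2: on the svn lock drops - svn working copies depend on NFS locking, so
it is worth checking whether a network lock manager is registered on the
servers at all. From any client:

$ rpcinfo -p g1

If no nlockmgr (program 100021) entry appears alongside the Gluster NFS
registrations, lock requests have nowhere to go, which would match the
behaviour you describe.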