Liam Slusser wrote:
> I've been playing with booster unfs and found that I cannot get it to work
> with a gluster config that uses cluster/distribute. I am using Gluster
> 2.0.3...

Thanks. I've seen the stale handle errors while using both replicate and
distribute. The fixes are in the repo but are not part of a release yet.
Release 2.0.5 will contain those changes. In the meantime, if you're really
interested, you can check out the repo as:

$ git clone git://git.sv.gnu.org/gluster.git ./glusterfs
$ cd glusterfs
$ git checkout -b release2.0 origin/release-2.0

Also, we've not yet announced it on the list, but a customised version of
unfs3 is available at:

http://ftp.gluster.com/pub/gluster/glusterfs/misc/unfs3/0.5/unfs3-0.9.23booster0.5.tar.gz

It has some bug fixes, performance enhancements and work-arounds to improve
behaviour with booster. Some documentation is available at:

http://www.gluster.org/docs/index.php/Unfs3boosterConfiguration

Thanks
Shehjar

> [root@box01 /]# mount -t nfs store01:/intstore.booster -o wsize=65536,rsize=65536 /mnt/store
> mount: Stale NFS file handle
>
> (just trying it again and sometimes it will mount...)
>
> [root@box01 /]# mount -t nfs store01:/store.booster -o wsize=65536,rsize=65536 /mnt/store
> [root@box01 /]# ls /mnt/store
> data
> [root@box01 store]# cd /mnt/store/data
> -bash: cd: /mnt/store/data/: Stale NFS file handle
> [root@box01 /]# cd /mnt/store
> [root@box01 store]# cd data
> -bash: cd: data/: Stale NFS file handle
> [root@box01 store]#
>
> Sometimes I can get df to show the actual cluster, but most times it gives
> me nothing.
>
> [root@box01 /]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> <....>
> store01:/store.booster
>                        90T   49T   42T  54% /mnt/store
> [root@box01 /]#
>
> [root@box01 /]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> <...>
> store01:/store.booster
>                         -     -     -    -  /mnt/store
>
> However, as soon as I remove cluster/distribute from my gluster client
> configuration file, it works fine (missing 2/3 of the files, because my
> gluster cluster distributes across three volumes on each of the two
> servers).
>
> A strace of unfs3 during one of the cd commands above outputs:
>
> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
> poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
> read(22, "\200\0\0\230B\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> futex(0x7fff31c7cb20, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
> setresgid(-1, 0, -1) = 0
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> setresuid(-1, 0, -1) = 0
> write(22, "\200\0\0 B\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
> poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
> read(22, "\200\0\0\230C\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> setresgid(-1, 0, -1) = 0
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> setresuid(-1, 0, -1) = 0
> write(22, "\200\0\0 C\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000 <unfinished ...>
>
> With the booster.fstab debug level set to DEBUG, this is all that shows up
> in the log file:
>
> [2009-07-23 02:52:16] D [libglusterfsclient-dentry.c:381:libgf_client_path_lookup] libglusterfsclient: resolved path(/) to 1/1
> [2009-07-23 02:52:17] D [libglusterfsclient.c:1340:libgf_vmp_search_entry] libglusterfsclient: VMP Entry found: /store.booster/: /store.booster/
>
> my /etc/booster.conf:
>
> /home/gluster/apps/glusterfs-2.0.3/etc/glusterfs/liam.conf /store.booster/ glusterfs subvolume=d,logfile=/home/gluster/apps/glusterfs-2.0.3/var/log/glusterfs/d.log,loglevel=DEBUG,attr_timeout=0
>
> my /etc/exports:
>
> /store.booster myclient(rw,no_root_squash)
>
> my client gluster config (liam.conf):
>
> volume brick1a
>   type protocol/client
>   option transport-type tcp
>   option remote-host server1
>   option remote-subvolume brick1a
> end-volume
>
> volume brick1b
>   type protocol/client
>   option transport-type tcp
>   option remote-host server1
>   option remote-subvolume brick1b
> end-volume
>
> volume brick1c
>   type protocol/client
>   option transport-type tcp
>   option remote-host server1
>   option remote-subvolume brick1c
> end-volume
>
> volume brick2a
>   type protocol/client
>   option transport-type tcp
>   option remote-host server2
>   option remote-subvolume brick2a
> end-volume
>
> volume brick2b
>   type protocol/client
>   option transport-type tcp
>   option remote-host server2
>   option remote-subvolume brick2b
> end-volume
>
> volume brick2c
>   type protocol/client
>   option transport-type tcp
>   option remote-host server2
>   option remote-subvolume brick2c
> end-volume
>
> volume bricks1
>   type cluster/replicate
>   subvolumes brick1a brick2a
> end-volume
>
> volume bricks2
>   type cluster/replicate
>   subvolumes brick1b brick2b
> end-volume
>
> volume bricks3
>   type cluster/replicate
>   subvolumes brick1c brick2c
> end-volume
>
> volume distribute
>   type cluster/distribute
>   subvolumes bricks1 bricks2 bricks3
> end-volume
>
> volume readahead
>   type performance/read-ahead
>   option page-size 2MB    # unit in bytes
>   option page-count 16    # cache per file = (page-count x page-size)
>   subvolumes distribute
> end-volume
>
> volume cache
>   type performance/io-cache
>   option cache-size 256MB
>   subvolumes readahead
> end-volume
>
> volume d
>   type performance/write-behind
>   option cache-size 16MB
>   option flush-behind on
>   subvolumes cache
> end-volume
>
> I've tried removing the performance translators with no change. Once I
> remove distribute and only connect to one of the three bricks on a server,
> it works perfectly.
>
> I do have a similar cluster that uses replicate but no distribute, and it
> works fine.
>
> Any ideas? Is this a bug?
>
> thanks,
> liam
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users
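[Editor's note] Liam mentions above that "just trying it again" sometimes
makes the mount succeed, so until the 2.0.5 fixes land, a small retry loop
can paper over the intermittent stale-handle error at mount time. A minimal
sketch follows; `retry_mount` is a hypothetical helper name of ours, not part
of gluster or unfs3booster, and the host/export names are taken from the
quoted session:

```shell
# retry_mount SRC DIR [TRIES]
# Retry an NFS mount a few times, since the quoted session shows the
# "Stale NFS file handle" error is intermittent and a second attempt
# can succeed. Returns 0 as soon as one mount attempt succeeds,
# non-zero if all attempts fail.
retry_mount() {
    src=$1
    dir=$2
    tries=${3:-5}
    i=1
    while [ "$i" -le "$tries" ]; do
        # same wsize/rsize options as in the quoted mount commands
        if mount -t nfs -o wsize=65536,rsize=65536 "$src" "$dir"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example, using the host and export from the thread:
#   retry_mount store01:/store.booster /mnt/store
```

This only works around the symptom, of course; the actual fixes are in the
release-2.0 branch and the 2.0.5 release noted above.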