Liam Slusser wrote:
> I've been playing with booster unfs and found that I cannot get it to work
> with a gluster config that uses cluster/distribute. I am using Gluster
> 2.0.3...

Thanks. I've seen the stale handle errors while using both replicate and
distribute. The fixes are in the repo but are not part of a release yet.
Release 2.0.5 will contain those changes. In the meantime, if you're really
interested, you can check out the repo as:

$ git clone git://git.sv.gnu.org/gluster.git ./glusterfs
$ cd glusterfs
$ git checkout -b release2.0 origin/release-2.0

Also, we've not yet announced it on the list, but a customised version of
unfs3 is available at:

http://ftp.gluster.com/pub/gluster/glusterfs/misc/unfs3/0.5/unfs3-0.9.23booster0.5.tar.gz

It has some bug fixes, performance enhancements and work-arounds to improve
behaviour with booster. Some documentation is available at:

http://www.gluster.org/docs/index.php/Unfs3boosterConfiguration

Thanks
Shehjar

> [root@box01 /]# mount -t nfs store01:/intstore.booster -o wsize=65536,rsize=65536 /mnt/store
> mount: Stale NFS file handle
>
> (just trying it again and sometimes it will mount...)
>
> [root@box01 /]# mount -t nfs store01:/store.booster -o wsize=65536,rsize=65536 /mnt/store
> [root@box01 /]# ls /mnt/store
> data
> [root@box01 store]# cd /mnt/store/data
> -bash: cd: /mnt/store/data/: Stale NFS file handle
> [root@box01 /]# cd /mnt/store
> [root@box01 store]# cd data
> -bash: cd: data/: Stale NFS file handle
> [root@box01 store]#
>
> Sometimes I can get df to show the actual cluster, but most times it gives
> me nothing.
>
> [root@box01 /]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> <....>
> store01:/store.booster
>                        90T   49T   42T  54% /mnt/store
> [root@box01 /]#
>
> [root@box01 /]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> <...>
> store01:/store.booster
>                         -     -     -    -  /mnt/store
>
> However, as soon as I remove cluster/distribute from my gluster client
> configuration file, it works fine (missing 2/3 of the files, because my
> gluster cluster distributes across three volumes on each of the two
> servers).
>
> A strace of unfs3 during one of the cd commands above outputs:
>
> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
> poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
> read(22, "\200\0\0\230B\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> futex(0x7fff31c7cb20, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
> setresgid(-1, 0, -1) = 0
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> setresuid(-1, 0, -1) = 0
> write(22, "\200\0\0 B\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
> poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
> read(22, "\200\0\0\230C\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> setresgid(-1, 0, -1) = 0
> tgkill(4574, 4576, SIGRT_1) = 0
> tgkill(4574, 4575, SIGRT_1) = 0
> setresuid(-1, 0, -1) = 0
> write(22, "\200\0\0 C\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000 <unfinished ...>
>
> With the booster.fstab debug level set to DEBUG, this is all that shows up
> in the log file:
>
> [2009-07-23 02:52:16] D [libglusterfsclient-dentry.c:381:libgf_client_path_lookup] libglusterfsclient: resolved path(/) to 1/1
> [2009-07-23 02:52:17] D [libglusterfsclient.c:1340:libgf_vmp_search_entry] libglusterfsclient: VMP Entry found: /store.booster/: /store.booster/
>
> my /etc/booster.conf:
>
> /home/gluster/apps/glusterfs-2.0.3/etc/glusterfs/liam.conf /store.booster/ glusterfs subvolume=d,logfile=/home/gluster/apps/glusterfs-2.0.3/var/log/glusterfs/d.log,loglevel=DEBUG,attr_timeout=0
>
> my /etc/exports:
>
> /store.booster myclient(rw,no_root_squash)
>
> my client gluster config (liam.conf):
>
> volume brick1a
>   type protocol/client
>   option transport-type tcp
>   option remote-host server1
>   option remote-subvolume brick1a
> end-volume
>
> volume brick1b
>   type protocol/client
>   option transport-type tcp
>   option remote-host server1
>   option remote-subvolume brick1b
> end-volume
>
> volume brick1c
>   type protocol/client
>   option transport-type tcp
>   option remote-host server1
>   option remote-subvolume brick1c
> end-volume
>
> volume brick2a
>   type protocol/client
>   option transport-type tcp
>   option remote-host server2
>   option remote-subvolume brick2a
> end-volume
>
> volume brick2b
>   type protocol/client
>   option transport-type tcp
>   option remote-host server2
>   option remote-subvolume brick2b
> end-volume
>
> volume brick2c
>   type protocol/client
>   option transport-type tcp
>   option remote-host server2
>   option remote-subvolume brick2c
> end-volume
>
> volume bricks1
>   type cluster/replicate
>   subvolumes brick1a brick2a
> end-volume
>
> volume bricks2
>   type cluster/replicate
>   subvolumes brick1b brick2b
> end-volume
>
> volume bricks3
>   type cluster/replicate
>   subvolumes brick1c brick2c
> end-volume
>
> volume distribute
>   type cluster/distribute
>   subvolumes bricks1 bricks2 bricks3
> end-volume
>
> volume readahead
>   type performance/read-ahead
>   option page-size 2MB    # unit in bytes
>   option page-count 16    # cache per file = (page-count x page-size)
>   subvolumes distribute
> end-volume
>
> volume cache
>   type performance/io-cache
>   option cache-size 256MB
>   subvolumes readahead
> end-volume
>
> volume d
>   type performance/write-behind
>   option cache-size 16MB
>   option flush-behind on
>   subvolumes cache
> end-volume
>
> I've tried removing the performance translators with no change. Once I
> remove distribute and only connect to one of the three bricks on a server,
> it works perfectly.
>
> I do have a similar cluster that uses replicate but no distribute, and it
> works fine.
>
> Any ideas? Is this a bug?
>
> thanks,
> liam
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users
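[Editor's note] Liam mentions above that "just trying it again" sometimes
makes the mount succeed, so until the 2.0.5 fixes land, a small retry loop
can paper over the intermittent stale-handle error at mount time. A minimal
sketch follows; `retry_mount` is a hypothetical helper name of ours, not part
of gluster or unfs3booster, and the host/export names are taken from the
quoted session:

```shell
# retry_mount SRC DIR [TRIES]
# Retry an NFS mount a few times, since the quoted session shows the
# "Stale NFS file handle" error is intermittent and a second attempt
# can succeed. Returns 0 as soon as one mount attempt succeeds,
# non-zero if all attempts fail.
retry_mount() {
    src=$1
    dir=$2
    tries=${3:-5}
    i=1
    while [ "$i" -le "$tries" ]; do
        # same wsize/rsize options as in the quoted mount commands
        if mount -t nfs -o wsize=65536,rsize=65536 "$src" "$dir"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example, using the host and export from the thread:
#   retry_mount store01:/store.booster /mnt/store
```

This only works around the symptom, of course; the actual fixes are in the
release-2.0 branch and the 2.0.5 release noted above.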