This seems to happen about 50% of the time:
[root@wcarh035 ~]# ls /gluster/data
ls: cannot open directory /gluster/data: No such file or directory
[root@wcarh035 ~]# ls /gluster/data
00 06.fun 15 23.fun 32 40.fun 47 55.fun 64
00.fun 07 15.fun 24 32.fun 41 47.fun 56 64.fun
01 07.fun 16 24.fun 33 41.fun 50 56.fun 65
01.fun 10 16.fun 25 33.fun 42 50.fun 57 65.fun
02 10.fun 17 25.fun 34 42.fun 51 57.fun 66
02.fun 11 17.fun 26 34.fun 43 51.fun 60 66.fun
03 11.fun 20 26.fun 35 43.fun 52 60.fun 67
03.fun 12 20.fun 27 35.fun 44 52.fun 61 67.fun
04 12.fun 21 27.fun 36 44.fun 53 61.fun lost+found
04.fun 13 21.fun 30 36.fun 45 53.fun 62
05 13.fun 22 30.fun 37 45.fun 54 62.fun
05.fun 14 22.fun 31 37.fun 46 54.fun 63
06 14.fun 23 31.fun 40 46.fun 55 63.fun
If the mount is not already up when the autofs directory is accessed,
the listing takes 3 to 5 seconds either way: about 50% of the time it
then displays properly, and the other 50% of the time it fails with a
"No such file or directory" error. This happens whether a longer path
(/gluster/data/44, for example) or just the top-level path is used, and
whether or not autofs --ghost is used. Could something be timing out too
soon when glusterfs takes too long to start?
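To put a number on the "about 50%" estimate, a small counting loop could help. This is only a sketch: count_failures is a hypothetical helper name, and the commented usage assumes the mount can be taken down between trials with a plain umount, which may not hold on every setup.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Hypothetical helper: run CMD N times and print how many runs
# exited with a non-zero status.
count_failures() {
    n=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # Output is discarded; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example usage (assumption: each trial must start with the mount down,
# so the trial itself unmounts before listing the directory):
#
#   trial() { umount /gluster/data 2>/dev/null; ls /gluster/data; }
#   count_failures 20 trial
```

A result near 10 out of 20 would match the behaviour described above; a result near 0 would suggest the unmount step is not really resetting the state.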
Here are the relevant autofs configurations:
[root@wcarh035 ~]# head -1 /etc/auto.master
/gluster /etc/glusterfs/auto.gluster --timeout=3600
[root@wcarh035 ~]# cat /etc/glusterfs/auto.gluster
data -fstype=glusterfs :/etc/glusterfs/gluster-data.vol
gluster-data.vol describes a 3-node cluster/replicate setup with some
of the performance/ modules activated.
Any suggestions?
I don't mind the autofs mount taking a few seconds to complete (although
if 3 to 5 seconds is unusual, perhaps I can fix that as well). I AM
concerned that when the autofs mount is used for the first time, or for
the first time after a period of inactivity, the request might
spuriously fail. This is bad. Is AutoFS at fault here, or is it GlusterFS?
My current guess is that GlusterFS is saying the mount is complete to
AutoFS before the actual mount operation takes effect. 50% of the time
GlusterFS is able to complete the mount before AutoFS lets the user
continue, and all is well. The other 50% of the time, GlusterFS does not
quite finish the mount, and AutoFS gives the user a broken directory.
I might try to prove this by adding a sleep 5 to /sbin/mount.glusterfs,
although I do not consider this a valid solution, as it just reduces the
effect of the race - it does not eliminate the race.
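A variation on the sleep experiment would be to poll for the mount actually showing up, with a bounded wait, instead of sleeping for a fixed time. The sketch below is an assumption-laden illustration, not a fix: wait_until and is_mounted are hypothetical helper names, the /proc/mounts check assumes Linux, and where (or whether) this could safely be called from /sbin/mount.glusterfs is untested. Like the sleep, it would only help confirm the guess about the race.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS CMD [ARGS...]
# Hypothetical helper: re-run CMD once per second until it succeeds.
# Returns 0 as soon as CMD succeeds, or 1 once TIMEOUT_SECONDS have
# elapsed without success.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# is_mounted DIR
# Hypothetical helper: succeeds if DIR appears as a mount point in
# /proc/mounts. Reading /proc/mounts avoids touching DIR itself, which
# could otherwise trigger another automount attempt.
is_mounted() {
    grep -q " $1 " /proc/mounts
}

# Example usage (assumption: called after the glusterfs client has
# been started for /gluster/data):
#
#   wait_until 10 is_mounted /gluster/data || echo "mount never appeared"
```

If the spurious failures disappear with this in place, that would support the idea that the mount is reported complete before it is actually usable.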
Cheers,
mark
--
Mark Mielke <mark@xxxxxxxxx>