Vikas Gorur wrote:
I'd like to clarify the situation with the "mount --bind" issue you're facing.
When GlusterFS is started, it does two things in the following order:
1) Daemonize itself (by calling daemon(3)).
2) Initialize the translator graph (and thus FUSE and mount).
This introduces a race since a shell command that runs glusterfs will return as
soon as glusterfs becomes a daemon. The next shell command (in a script) will
try something like "mount --bind", but glusterfs might not have finished initializing
FUSE and mounting by then. Putting a "sleep 2" in your script works because it gives
glusterfs time to initialize and mount.
Unfortunately, the obvious solution of interchanging the order of these
two things does not work. The translator graph might contain an ib-verbs
transport module. The ib-verbs library expects that the process which
does ib-verbs init is also the process that does any further send/recv.
Calling daemon(), however, forks a child and makes the parent exit, so
GlusterFS carries on under a different PID.
The solution is to write a custom daemon() function in which the parent
waits for the child to successfully initialize everything before exiting.
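A rough sketch of what such a custom daemon() could look like is below. This is not the actual GlusterFS code, only an illustration of the idea. The names daemonize_after_init(), report_status(), wait_for_status() and the init() callback (standing in for translator graph setup) are all made up for this example. The parent forks and then blocks on a pipe until the child reports the outcome, so the shell gets control back only after initialization has finished, and the PID that initializes ib-verbs is the same one that does the later send/recv.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Child side: tell the waiting parent how initialization went.
 * status 0 means "initialized and mounted"; anything else is a failure. */
void report_status(int fd, unsigned char status)
{
    ssize_t n = write(fd, &status, 1);
    (void)n;            /* nothing useful to do if the parent is gone */
    close(fd);
}

/* Parent side: block until the child reports, or until the pipe closes
 * because the child died without reporting. Returns 0 only on success. */
int wait_for_status(int fd)
{
    unsigned char status = 1;
    ssize_t n;

    do {
        n = read(fd, &status, 1);
    } while (n < 0 && errno == EINTR);
    close(fd);

    return (n == 1 && status == 0) ? 0 : -1;
}

/* Hypothetical replacement for daemon(3). Returns 0 in the child once
 * init() has succeeded, or -1 in the caller if pipe()/fork() failed.
 * The parent never returns from a successful fork: it exits with the
 * child's reported status. The chdir("/") and the redirection of stdio
 * to /dev/null that daemon(3) performs are left out for brevity. */
int daemonize_after_init(int (*init)(void))
{
    int fds[2];
    pid_t pid;
    int rc;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid > 0) {
        /* Parent: never runs init(), it only waits for the verdict. */
        close(fds[1]);
        _exit(wait_for_status(fds[0]) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
    }

    /* Child: this PID does the initialization and all later I/O. */
    close(fds[0]);
    if (setsid() < 0) {
        report_status(fds[1], 1);
        _exit(EXIT_FAILURE);
    }

    rc = init();
    report_status(fds[1], rc == 0 ? 0 : 1);
    if (rc != 0)
        _exit(EXIT_FAILURE);

    return 0;
}
```

With something along these lines, a script that runs glusterfs and then "mount --bind" should not need a sleep, because the glusterfs command would return only after the child has reported in. A failed initialization would also show up as a non-zero exit status.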
In the meantime, putting "sleep 5" in the right place in
mount.glusterfs would do just as nicely. It could even be broken out
into a mount parameter, so that the delay can be tuned by hand to the
minimum that the particular hardware needs, e.g.
mount -t glusterfs -o wait=3 /etc/gluster/some.col /mnt/point
It's a bodge, but at least it doesn't require hard-coding a delay anywhere,
and it will do as a workaround until a real solution is implemented.
I can provide a patch that does this if there's interest.
Gordan