Hi everyone,
I hope someone can help me. I have created a DRBD device with 2
servers and present this device to 2 other servers in a Red Hat cluster
using 2 iSCSI paths from each DRBD server to each cluster node (i.e., 4
paths per cluster node, 8 paths in total). I then use multipath so that
each cluster node identifies the paths as belonging to the same device.
Finally, I create a GFS2 filesystem on the device. This was all going
very well and I was experimenting with different settings for the round
robin behaviour of multipath until I decided to carve the DRBD device
into smaller chunks. After some playing around I managed this, but now I
can only get the GFS2 filesystem to mount properly on both cluster nodes if
the round robin switching parameter (rr_min_io) is set to 1000. I had
previously been able to use values of 100, 50, 2, 1 and many others, but
these settings now cause GFS2 to hang or refuse to mount. By looking
through the various mailing lists I have been able to update to kernel
2.6.18-162.el5, which has stopped the hanging, but the GFS2 filesystem still
refuses to mount at times (running gfs2_fsck several times sometimes helps
here) and will withdraw after a few I/Os (at least that's what
dmesg tells me). This is pure speculation, but I am wondering if there
are some timers I need to set to allow GFS2 to coordinate better with
lower rr_min_io values. I'm happy to provide output, error messages, etc.,
but I'm not sure at this stage what would be useful.
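
For context, rr_min_io is set in /etc/multipath.conf. The stanza below is
only a minimal sketch of where the setting lives, not a copy of the running
configuration: the multibus grouping policy is an assumption, and 1000 is
the one value that currently works.

```
defaults {
        # Assumption: all paths in one group, so I/O rotates across them
        path_grouping_policy    multibus
        # Number of I/Os sent down a path before switching to the next one
        rr_min_io               1000
}
```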
Thanks in advance for any help.

Kind regards,
Mike O'S
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster