On Fri, May 18, 2007 at 03:49:14PM +0200, Mathieu Avila wrote:
> Sorry for my late reply,
>
> I've performed the following tests with cluster-1.03:
> - mount GFS on more than 1 node, using Gulm as the lock manager.
> - cp'ing something big (a kernel) into it on each node,
> - while it does that, manage to have the device returning I/O errors.
> The result is not what you described: sometimes my "cp" finishes with
> I/O errors (that's good), but most of the times it is blocked in the
> kernel. I cannot perform any action, including umount. Syscalls like
> "df" are blocked, too.
>
> I've done the same test with DLM and got the same results.

Is there anything about "withdraw" in dmesg or /var/log/messages after you
cause the i/o errors?  If not, then the i/o errors are not being reported
back to gfs for some reason.  Perhaps there are some block/scsi drivers that
don't properly return i/o errors to the fs?  Once gfs sees i/o errors and
does the withdraw, it should usually work, although it does have problems
occasionally.

Dave

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
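
The check suggested above (looking for "withdraw" in dmesg and /var/log/messages after the i/o errors) could be scripted roughly as follows. This is only a sketch: the helper names are made up for illustration, the match is a plain case-insensitive substring search because the exact wording of the gfs message is not given here, and reading dmesg or /var/log/messages may need root on some systems.

```python
import subprocess
from pathlib import Path


def find_withdraw_lines(text):
    """Return the lines of `text` that mention "withdraw" (case-insensitive)."""
    return [line for line in text.splitlines() if "withdraw" in line.lower()]


def read_kernel_log():
    """Return the output of `dmesg`, or an empty string if it cannot be run."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout


def read_syslog(path="/var/log/messages"):
    """Return the contents of the syslog file, or an empty string if unreadable."""
    try:
        return Path(path).read_text(errors="replace")
    except OSError:
        return ""


if __name__ == "__main__":
    sources = (
        ("dmesg", read_kernel_log()),
        ("/var/log/messages", read_syslog()),
    )
    for name, text in sources:
        hits = find_withdraw_lines(text)
        print(f"{name}: {len(hits)} line(s) mentioning withdraw")
        for line in hits:
            print("  " + line)
```

If neither source shows a matching line after the errors are triggered, that would be consistent with the i/o errors never reaching gfs.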