On Mon, 2006-08-28 at 12:24 -0400, Theodore Tso wrote:
> On Mon, Aug 28, 2006 at 08:49:36AM -0500, James Bottomley wrote:
> > On Mon, 2006-08-28 at 09:31 -0400, Theodore Tso wrote:
> > > IMHO the right thing is for the device driver to retry for some amount
> > > of time (maybe measured in seconds or perhaps a single digit number of
> > > minutes), and in the meantime, pass a signal to the rest of the kernel
> > > that any process that attempts to write to the filesystem should be
> > > frozen while we wait for the disk to come back.
> >
> > Actually, for this exact case, there's a feature propagating through the
> > transport classes called the dev loss timer.  Its job, for pluggable
> > transports like FC, is to allow the user time to unplug and replug
> > cables before the system declares the device lost and starts erroring
> > requests (which is what causes the fs to go read only).  Since the
> > original reporter seemed to be using fibre, it sounds like this would
> > suit.  Beware: the dev loss timer shouldn't be much longer than the
> > SCSI command timeout (say ~30s) or nasty things may happen.
>
> Yes, that sounds ideal.  Does the dev loss timer need to be
> configured, or is it going to be enabled with an
> appropriate-for-most-systems default value (such as the SCSI
> command timeout)?

It's configurable via the fc transport class rports (in
/sys/class/fc_rport_class, value dev_loss_tmo); the default value is 60s.

> Also, when did this get added to the various transport classes?  I
> assume it's not going to be of much help for the original reporter if he
> needs it to work on a RHEL 3 AS Update 6 kernel, but hopefully it will
> be in SLES 10 / RHEL 5?  Or is this something that is just going into
> the 2.6 mainline now?

Erm, pass.  It predates git, so at least 2.6.12-rc2.

James

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users
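
For anyone wanting to check the dev_loss_tmo setting mentioned above, here is a minimal sketch in Python that lists the value for each remote port. It assumes the layout described in the thread: one subdirectory per rport under the class directory, each holding a dev_loss_tmo file containing a whole number of seconds. The function name read_dev_loss_tmo is made up for illustration, and the class directory is passed in as a parameter because its name may differ on your kernel (some kernels expose it as /sys/class/fc_remote_ports).

```python
from pathlib import Path


def read_dev_loss_tmo(class_dir):
    """Return {rport_name: seconds} for each entry under class_dir
    that has a readable, integer-valued dev_loss_tmo attribute."""
    timeouts = {}
    root = Path(class_dir)
    if not root.is_dir():
        # No FC transport class directory here (no FC HBA, or other path).
        return timeouts
    for rport in sorted(root.iterdir()):
        attr = rport / "dev_loss_tmo"
        try:
            timeouts[rport.name] = int(attr.read_text().strip())
        except (OSError, ValueError):
            # Attribute missing, unreadable, or not an integer: skip it.
            continue
    return timeouts


if __name__ == "__main__":
    # Assumption: path as given in the thread; adjust it for your kernel.
    for name, seconds in read_dev_loss_tmo("/sys/class/fc_rport_class").items():
        print(f"{name}: dev_loss_tmo = {seconds}s")
```

Changing the value would be a matter of writing a new number of seconds to the same file as root. The sketch only reads, since the thread warns that a timer much longer than the SCSI command timeout can cause trouble.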