On Wed, 27 Apr 2005 mike.miller@xxxxxx wrote:
Hello All, I have observed some behavior under certain failure conditions that seems as if the kernel may be ignoring write errors to disk. During very heavy read/write io if we force a disk to fail requests continue to be submitted until the controllers queue is full. Ultimately, the requests are timed out by the controller. When this happens we see filesystem corruption. Sometimes it's the file data, other times it's filesystem metadata that has been timed out and failed. Either way its obviously undesirable behavior. It looks like the OS/filesystem (ext2/3 and reiserfs) does not wait for for a successful completion. Is this assumption correct?
Thanks, mikem
It depends. Obviously if you disconnect your hard drive, the writes will fail with a time-out. But they fail after a number of retries (it depends upon the type of disk and its driver). So, if you "force" a timeout by disconnecting a drive, you don't have the same situtation as a normally failed write.
Disk/file writes go like this (assuming no sync() or fsync()).
(1) File data gets flushed to a queue. (2) When the queue gets nearly full, based upon a LRU mechanism, data are written to the disk. (3) If the disk-write fails, the driver retries the write. (4) If the write continues to fail, i.e., timeout, no disk, etc. the kernel gives up and does not hang forever. If you have disconnected the drive, you won't have any syslog writes to the device so your next boot won't show the event. It looks as though it was ignored.
You can observe the behavior by mounting a floppy disk and then removing it while it is being written. There are many attempts to write to the device and then that write is discarded.
Cheers, Dick Johnson Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips). Notice : All mail here is now cached for review by Dictator Bush. 98.36% of all statistics are fiction. - : send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html