James Bottomley wrote:
> On Wed, 2005-06-15 at 10:34 -0500, Brian King wrote:
>
>>For scsi disks attached to an ipr adapter, the SYNCHRONIZE_CACHE command
>>gets sent to the disk and works just like when attached to any other
>>HBA. The ipr disk array devices, however, do not support the SYNC_CACHE command,
>>nor do they support the caching mode page, so SYNC_CACHE never gets sent.
>>So, the shutdown hook is needed to flush the adapter's battery backed write
>>cache for all attached disk arrays on system shutdown.
>
>
> Well, that means you have a whole lot more trouble in 2.6.12 than simply
> failing to flush a cache on shutdown. The barrier code now uses cache
> synchronization commands, so if you crash the on-disk image will not be
> what a journalling filesystem expects. As long as the battery keeps the
> information alive in the cache, I assume this corrects itself when ipr
> next powers up, but if you trusted this, you wouldn't be fussing about
> the shutdown cache flush, now would you?

System crash and controlled power off are two different things. Yes, if
the adapter is not told about the shutdown so it can flush its cache,
the data will be maintained, but it can only be maintained for as long
as the battery lasts. Currently, with 2.6.12-rc, if you do a normal
power off and leave the system powered off for a long enough period of
time (usually a week or so, depending on the battery on the card), you
will suffer data loss. This is not acceptable. With 2.6.11, the adapter
is told to flush the write cache on a controlled power off, so the
system can remain powered off indefinitely with no data loss.

Regarding supporting the sync cache command to a disk array, I did
consider requesting support be added to the adapter microcode, but
decided against it since it would make a RAID array perform horribly
and it did not seem to be the appropriate model to follow for NV write
cache.
Some of the ipr adapters have over 500 MB of write cache, which can
take an awfully long time to flush, much longer than you would want for
a journal barrier. So, all the barrier/SYNC_CACHE code works great for
a volatile write cache on a disk, but doesn't necessarily seem like the
best model for large non-volatile write caches for RAID adapters.

-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center