If the writeback cache is enabled (per the WCE bit in the Caching mode page), prudent software uses the FUA bit in WRITE commands when writing metadata and/or sends the SYNCHRONIZE CACHE command at important checkpoints to ensure the data will not be lost on a power loss. Some database software is particularly prolific at sending these commands.

Around 2003, many RAID controllers with non-volatile writeback caches honored the SYNCHRONIZE CACHE command by flushing the entire cache to the drives. This started causing timeouts as non-volatile write cache sizes grew. Recently, it's even causing trouble on individual disk drives with growing volatile write caches.

The intent of software using these commands and bits was unclear. It could be:
a) ensure data is in non-volatile cache (and will eventually be flushed) or on the medium; or
b) ensure data is on the medium (so the drives are ready for removal).

As a short-term fix, many RAID controllers assumed intent (a) and started interpreting the SYNCHRONIZE CACHE command as a NOP and ignoring the FUA bit. Surprise removal of a drive from a RAID controller is risky even if software has run SYNCHRONIZE CACHE, since the RAID controller might be doing other activity in the background. So, there are other reasons to justify assuming that the user just won't do that.

Afraid of breaking software with intent (b) (which was more likely in the days of floppy disks, Bernoulli Boxes, and other removable block devices), T10 chose to clarify that the original meaning was (b) and added new FUA_NV and SYNC_NV bits to let software express intent (a). The hope was that devices would implement the bits and software would start using them at appropriate times.

Unfortunately, the short-term fix worked well enough that it still prevails today, and most standalone removable media block devices have disappeared. There is not much software actually sending the FUA_NV and SYNC_NV bits, and few devices honor the bits per the standard.

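To make the bits concrete, here is a rough sketch of where they sit in the 10-byte CDBs. The bit positions are my reading of the SBC-3 drafts of this era (the ones that still define FUA_NV and SYNC_NV) and should be checked against the revision you care about; the function names are made up for illustration and are not from any existing tool.

```python
import struct

# Opcodes and bit masks as understood from SBC-3 drafts that still define
# FUA_NV and SYNC_NV. Treat them as assumptions and verify before use.
WRITE_10 = 0x2A
SYNCHRONIZE_CACHE_10 = 0x35
FUA = 0x08      # WRITE (10), byte 1, bit 3
FUA_NV = 0x02   # WRITE (10), byte 1, bit 1
SYNC_NV = 0x04  # SYNCHRONIZE CACHE (10), byte 1, bit 2


def write10_cdb(lba, blocks, fua=False, fua_nv=False):
    """Build a WRITE (10) CDB (hypothetical helper)."""
    flags = (FUA if fua else 0) | (FUA_NV if fua_nv else 0)
    # opcode, flags, 32-bit LBA, group number, 16-bit transfer length, control
    return struct.pack(">BBIBHB", WRITE_10, flags, lba, 0, blocks, 0)


def sync_cache10_cdb(lba=0, blocks=0, sync_nv=False):
    """Build a SYNCHRONIZE CACHE (10) CDB (hypothetical helper)."""
    flags = SYNC_NV if sync_nv else 0
    # opcode, flags, 32-bit LBA, group number, 16-bit block count, control
    return struct.pack(">BBIBHB", SYNCHRONIZE_CACHE_10, flags, lba, 0, blocks, 0)
```

Calling `sync_cache10_cdb()` with the defaults gives opcode 0x35 followed by nine zero bytes: SYNC_NV is unset, and a block count of zero covers everything from the starting LBA to the end of the medium. Software wanting to express intent (a) would pass `sync_nv=True` or `fua_nv=True` instead.
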
As an SBC-3 letter ballot comment, I recently submitted T10 proposal 13-050 (see http://www.t10.org/doc13.htm) to obsolete the SYNC_NV and FUA_NV bits and change the meaning of the commands without those bits to intent (a), reflecting what the industry has actually done.

-----Original Message-----
From: linux-scsi-owner@xxxxxxxxxxxxxxx [mailto:linux-scsi-owner@xxxxxxxxxxxxxxx] On Behalf Of Jeremy Linton
Sent: Tuesday, April 23, 2013 5:40 PM
To: James Bottomley
Cc: Ric Wheeler; linux-scsi@xxxxxxxxxxxxxxx; Martin K. Petersen; Jeff Moyer; Tejun Heo; Mike Snitzer; dgilbert@xxxxxxxxxxxx
Subject: Re: T10 WCE interpretation in Linux & device level access

On 4/23/2013 3:07 PM, James Bottomley wrote:
>
> I bet they don't; they probably obey the spec. There's a SYNC_NV bit
> which if unset (which it is in our implementation) means only sync your
> non-NV cache. For a device with all NV, that equates to nop.

Yes, Linux leaves the SYNC_NV bit unset in scsi_setup_flush_cmnd().

The draft specs, and a couple of others I have lying about, say the device shall sync cache to medium for both volatile and non-volatile cache data if SYNC_NV is _unset_.

With it set, the table could be more confusing! For volatile cache blocks with SYNC_NV set: "If a non-volatile cache is present, then the device server shall synchronize to non-volatile cache or to the medium. If a non-volatile cache is not present, then the device server shall synchronize to the medium". And for non-volatile cache with it set: "No Requirement".

Which to me says, don't expect any particular behavior if you set this bit and have NV: it could flush to medium, flush to NV cache, or do nothing at all. But it seems pretty clear that with it unset it's probably going to get synchronized to the medium.

If T10 were to do something, maybe they could stop putting bits in the docs that aren't guaranteed to do anything (fill in rant).

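The table as quoted above reduces to a small lookup. This sketch encodes only what the quoted text says and nothing more from the standard; the function and parameter names are invented for illustration.

```python
def required_sync_target(sync_nv, block_in_nv_cache, nv_cache_present=True):
    """Where a cached block must end up after SYNCHRONIZE CACHE,
    following the table as paraphrased in the message above."""
    if not sync_nv:
        # SYNC_NV unset: volatile and non-volatile cache data both go to medium.
        return "medium"
    if block_in_nv_cache:
        # SYNC_NV set, block already in non-volatile cache.
        return "no requirement"
    # SYNC_NV set, block in volatile cache.
    if nv_cache_present:
        return "non-volatile cache or medium"
    return "medium"
```

With SYNC_NV unset, every case returns "medium", which is why a device that treats the command as a NOP is following the short-term fix and not the table.
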
As for Linux, it seems the state of the spec really doesn't leave any good options other than to give the user the ability to disable the flush_cmnd() if the NV_SUP bit is set. Or maybe a whitelist (ick!)...

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html