On Wed, Nov 30, 2016 at 12:04:24PM -0800, L A Walsh wrote:
>
>
> Eric Sandeen wrote:
> >
> >>   But those systems also, sometimes, change runtime
> >>behavior based on the UPS or battery state -- using write-back on
> >>a full-healthy battery, or write-through when it wouldn't be safe.
> >>
> >>   In that case, it seems nobarrier would be a better choice
> >>for those volumes -- letting the controller decide.
> >
> >No.  Because then xfs will /never/ send barriers requests, even
> >if the battery dies.  So I think you have that backwards.

Let's just get something straight first - there is no "barrier"
operation that is sent to the storage, and Linux does not have
"barriers" anymore. What we now do is strictly order our IO at the
filesystem level and issue cache flush requests to ensure all IO
prior to the cache flush request is on stable storage. We also make
use of FUA writes, which guarantee that a specific write hits stable
storage before the filesystem is told that it is complete (FUA is
emulated with post-IO cache flush requests on devices that don't
support FUA).

This is why "barriers" no longer have a performance cost - we don't
need to empty the IO pipeline to guarantee integrity anymore. And it
should be clear why hardware that has non-volatile caches doesn't
care whether "barriers" are enabled or not, because all writes are
FUA and cache flushes are no-ops.

IOWs, "barriers" are an outdated concept and we only still have it
hanging around because we were stupid enough to name a mount option
after an implementation, rather than the feature it provided.

> ---
>   If the battery dies, then the controller shifts
> to write-through and no longer uses its write cache.  This is
> documented and observed behavior.

For /some/ RAID controllers in /some/ modes. For example, the
megaraid driver has been ignoring cache flushes for over 9 years
because in RAID mode it doesn't need them.
However, in JBOD mode, that same controller requires cache flushes to
be sent because it turns off sane cache management behaviour in JBOD
mode:

commit 1e793f6fc0db920400574211c48f9157a37e3945
Author: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx>
Date:   Fri Oct 21 06:33:32 2016 -0700

    scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices

    Commit 02b01e010afe ("megaraid_sas: return sync cache call with
    success") modified the driver to successfully complete
    SYNCHRONIZE_CACHE commands without passing them to the controller.
    Disk drive caches are only explicitly managed by controller firmware
    when operating in RAID mode. So this commit effectively disabled
    writeback cache flushing for any drives used in JBOD mode, leading
    to data integrity failures.

This is a clear example of why "barriers" should always be on and
cache flushes always passed through to the storage - because we just
don't know WTF the storage is actually doing with its caches.

Quite frankly, I think it's time we marked the "barrier/nobarrier"
mount options as deprecated and simply always issue the required
cache flushes.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx