On Tue, 31 May 2022, Keith Busch wrote:
> On Tue, May 31, 2022 at 12:42:49PM -0700, Eric Wheeler wrote:
> > On Sat, 28 May 2022, Keith Busch wrote:
> > > On Sat, May 28, 2022 at 12:57:26PM +0000, Adriano Silva wrote:
> > > > Dear Christoph,
> > > >
> > > > > Once you do that, the block layer ignores all flushes and FUA bits, so
> > > > > yes it is going to be a lot faster. But also completely unsafe because
> > > > > it does not provide any data durability guarantees.
> > > >
> > > > Sorry, but wouldn't it be the other way around? Or did I really not
> > > > understand your answer?
> > > >
> > > > Sorry, I don't know anything about kernel code, but wouldn't it be the
> > > > other way around?
> > > >
> > > > It's just that, I may not be understanding. And it's likely that I'm
> > > > not, because you understand more about this, I'm new to this subject.
> > > > I know very little about it, or almost nothing.
> > > >
> > > > But it's just that I've read the opposite about it.
> > > >
> > > > Isn't "write through" to provide more secure writes?
> > > >
> > > > I also see that "write back" would be meant to be faster. No?
> > >
> > > The sysfs "write_cache" attribute just controls what the kernel does. It
> > > doesn't change any hardware settings.
> > >
> > > In "write back" mode, a sync write will have FUA set, which will generally be
> > > slower than a write without FUA. In "write through" mode, the kernel doesn't
> > > set FUA so the data may not be durable after the completion if the controller
> > > is using a volatile write cache.
> >
> > Something seems wrong here: Typically on a RAID controller LUN
> > configuration "writeback" means that the non-volatile cache is active so
> > "write back caching" is enabled.
> >
> > According to https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt:
> >
> >   "When read, this file will display whether the device has write
> >   back caching enabled or not. It will return "write back" for the former
> >   case, and "write through" for the latter."
> >
> > If my text mailer would underline then I would underline this from the
> > documentation: "whether the device has write back caching enabled or not"
>
> Maybe this is confusing because we let the user change the kernel's behavior
> regardless of how the storage device is configured?

This is important to keep, not all controllers properly report the LUN's
cache state, so overrides are necessary in real life...but that's not what
we're hoping to address:

> > Is there a good explanation for why the kernel setting is exactly
> > _opposite_ of the controller setting?
>
> By default, the drivers should have the correct setting reported for their
> devices, not the opposite. The user can override the sysfs attribute to the
> opposite setting though, so it's not necessarily an accurate report of what the
> device has actually enabled.

Let's assume for the moment that drivers always set this flag correctly
because that isn't really the issue here:

This is a discussion of terminology. What I mean is that the very term
"write-through" means to write _through_ the cache and block until
completion to persistent storage, whereas, "write-back" means to return
completion to the OS before IOs reach persistent storage.

...or at least this is the terminology that the RAID card manufacturers
have used for decades.

I actually checked Wikipedia (as a zeitgeist reference, not as an
authority) just in case I've been mistaken all these years as to the
spirit of the meaning and it aligns with what I'm trying to express here:

  * Write-through: write is done synchronously both to the cache and to
    the backing store.

  * Write-back (also called write-behind): initially, writing is done only
    to the cache. The write to the backing store is postponed until the
    modified content is about to be replaced by another cache block.

  [ https://en.wikipedia.org/wiki/Cache_(computing)#Writing_policies ]

So the kernel's notion of "write through" meaning "Drop FLUSH/FUA" sounds
like the industry meaning of "write-back" as defined above; conversely,
the kernel's notion of "write back" sounds like the industry definition of
"write-through".

Is there a well-meaning rationale for the kernel's concept of "write
through" to be different than what end users have been conditioned to
understand?

--
Eric Wheeler