Re: [RFC] Add sysctl option to drop disk flushes in bcache? (was: Bcache in writes direct with fsync)

On Tue, 31 May 2022, Keith Busch wrote:
> On Tue, May 31, 2022 at 12:42:49PM -0700, Eric Wheeler wrote:
> > On Sat, 28 May 2022, Keith Busch wrote:
> > > On Sat, May 28, 2022 at 12:57:26PM +0000, Adriano Silva wrote:
> > > > Dear Christoph,
> > > > 
> > > > > Once you do that, the block layer ignores all flushes and FUA bits, so
> > > > > yes it is going to be a lot faster.  But also completely unsafe because
> > > > > it does not provide any data durability guarantees.
> > > > 
> > > > Sorry, but wouldn't it be the other way around? Or did I really not 
> > > > understand your answer?
> > > > 
> > > > Sorry, I don't know anything about kernel code, but wouldn't it be the 
> > > > other way around?
> > > > 
> > > > It's just that, I may not be understanding. And it's likely that I'm 
> > > > not, because you understand more about this, I'm new to this subject. 
> > > > I know very little about it, or almost nothing.
> > > > 
> > > > But it's just that I've read the opposite about it.
> > > > 
> > > >  Isn't "write through" to provide more secure writes?
> > > > 
> > > > I also see that "write back" would be meant to be faster. No?
> > > 
> > > The sysfs "write_cache" attribute just controls what the kernel does. It
> > > doesn't change any hardware settings.
> > > 
> > > In "write back" mode, a sync write will have FUA set, which will generally be
> > > slower than a write without FUA. In "write through" mode, the kernel doesn't
> > > set FUA so the data may not be durable after the completion if the controller
> > > is using a volatile write cache.
> > 
> > Something seems wrong here: Typically on a RAID controller LUN 
> > configuration "writeback" means that the non-volatile cache is active so 
> > "write back caching" is enabled.
> > 
> > According to https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt:
> > 
> > 	"When read, this file will display whether the device has write
> > 	back caching enabled or not. It will return "write back" for the former
> > 	case, and "write through" for the latter."
> > 
> > If my text mailer would underline then I would underline this from the 
> > documentation: "whether the device has write back caching enabled or not"
> 
> Maybe this is confusing because we let the user change the kernel's behavior
> regardless of how the storage device is configured?

This is important to keep: not all controllers properly report the LUN's 
cache state, so overrides are necessary in real life. But that's not what 
we're hoping to address:
  
> > Is there a good explanation for why the kernel setting is exactly 
> > _opposite_ of the controller setting?
> 
> By default, the drivers should have the correct setting reported for their
> devices, not the opposite. The user can override the sysfs attribute to the
> opposite setting though, so it's not necessarily an accurate report of what the
> device has actually enabled.

Let's assume for the moment that drivers always set this flag correctly, 
because that isn't really the issue here: this is a discussion of 
terminology.

What I mean is that the very term "write-through" means to write _through_ 
the cache and block until completion to persistent storage, whereas 
"write-back" means to return completion to the OS before IOs reach 
persistent storage.

...or at least this is the terminology that the RAID card manufacturers 
have used for decades.  I actually checked Wikipedia (as a zeitgeist 
reference, not as an authority) just in case I've been mistaken all these 
years as to the spirit of the meaning and it aligns with what I'm trying 
to express here:

  * Write-through: write is done synchronously both to the cache and to 
    the backing store.

  * Write-back (also called write-behind): initially, writing is done only 
    to the cache. The write to the backing store is postponed until the 
    modified content is about to be replaced by another cache block.
  [ https://en.wikipedia.org/wiki/Cache_(computing)#Writing_policies ]


So the kernel's notion of "write through" meaning "Drop FLUSH/FUA" sounds 
like the industry meaning of "write-back" as defined above; conversely, 
the kernel's notion of "write back" sounds like the industry definition of 
"write-through".

Is there a well-meaning rationale for the kernel's concept of "write 
through" to differ from what end users have been conditioned to 
understand?
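
For reference, here is a minimal sketch of the kernel-side meaning as it 
has been described earlier in this thread. It is only an illustration in 
Python, not kernel code: the helper names are hypothetical, and the only 
real interface it touches is the queue/write_cache sysfs attribute.

```python
from pathlib import Path

# Kernel-side meaning of queue/write_cache, per the descriptions quoted
# in this thread (not the RAID-controller meaning of the same words):
#   "write back"    -> the kernel issues FLUSH/FUA for sync writes
#   "write through" -> the kernel drops FLUSH/FUA
KERNEL_SENDS_FLUSH = {
    "write back": True,
    "write through": False,
}


def kernel_sends_flush(write_cache: str) -> bool:
    """Hypothetical helper: does the kernel send FLUSH/FUA for this setting?"""
    return KERNEL_SENDS_FLUSH[write_cache.strip()]


def read_write_cache(dev: str, sysfs_root: str = "/sys/block") -> str:
    """Hypothetical helper: read the attribute for a device name such as 'sda'."""
    return (Path(sysfs_root) / dev / "queue" / "write_cache").read_text().strip()
```

On a real system, kernel_sends_flush(read_write_cache("sda")) would then 
say whether flushes are passed down for that device ("sda" is just an 
example name).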

--
Eric Wheeler

 
