Re: [RFC] Add sysctl option to drop disk flushes in bcache? (was: Bcache in writes direct with fsync)

On Tue, 24 May 2022, Keith Busch wrote:
> On Tue, May 24, 2022 at 01:14:18PM -0700, Eric Wheeler wrote:
> > Hi Christoph,
> > 
> > On Mon, 23 May 2022, Christoph Hellwig wrote:
> > > ... wait.
> > > 
> > > Can someone explain what this is all about?  Devices with power fail 
> > > protection will advertise that (using VWC flag in NVMe for example) and 
> > > we will never send flushes. So anything that explicitly disables flushes 
> > > will generally cause data corruption.
> > 
> > Adriano was getting 1.5ms sync-write ioping's to an NVMe through bcache 
> > (instead of the expected ~70us), so perhaps the NVMe flushes were killing 
> > performance if every write was also forcing an erase cycle.
> > 
> > The suggestion was to disable flushes in bcache as a troubleshooting step 
> > to see if that solved the problem, but with the warning that it could be 
> > unsafe.
> > 
> > Questions:
> > 
> > 1. If a user knows their disks have a non-volatile cache then is it safe 
> >    to drop flushes?
> > 
> > 2. If not, then under what circumstances is it unsafe with a non-volatile 
> >    cache?
> >   
> > 3. Since the block layer won't send flushes when the hardware reports that 
> >    the cache is non-volatile, then how do you query the device to make 
> >    sure it is reporting correctly?  For NVMe you can get VWC as:
> > 	nvme id-ctrl -H /dev/nvme0 |grep -A1 vwc
> >    
> >    ...but how do you query a block device (like a RAID LUN) to make sure 
> >    it is reporting a non-volatile cache correctly?
> 
> You can check the queue attribute, /sys/block/<disk>/queue/write_cache. If the
> value is "write through", then the device is reporting it doesn't have a
> volatile cache. If it is "write back", then it has a volatile cache.
 
Thanks, Keith!  

Is this flag influenced at all when /sys/block/sdX/queue/scheduler is set 
to "none", or does the write_cache flag operate independently of the 
selected scheduler?

Does the block layer stop sending flushes at the first device in the stack 
that is set to "write through"?  For example, if a device mapper target 
reports write through, will it strip flushes on the way to the backing 
device?

This confirms what I have suspected all along: We have an LSI MegaRAID 
SAS-3516 where the write policy is "write back" in the LUN, but the cache 
is flagged in Linux as write-through:

	# cat /sys/block/sdb/queue/write_cache
	write through

I guess this is the correct place to adjust that behavior!
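
For anyone wanting to check every disk at once, here is a rough sketch that 
walks the queue attribute Keith pointed to and prints the kernel's view for 
each device. The function name report_write_cache and its optional directory 
argument are my own invention (the argument only makes it easy to point the 
function at a test directory), and the notes it prints restate what was said 
earlier in this thread, so treat it as an illustration rather than a 
definitive tool:

```shell
#!/bin/sh
# report_write_cache [SYSFS_BLOCK_DIR]
# Hypothetical helper: print each block device together with the value of
# its queue/write_cache attribute. SYSFS_BLOCK_DIR defaults to /sys/block.
report_write_cache() {
    dir="${1:-/sys/block}"
    for attr in "$dir"/*/queue/write_cache; do
        # Skip devices without the attribute (also covers an unmatched glob).
        [ -r "$attr" ] || continue
        disk="${attr#"$dir"/}"      # drop the leading directory
        disk="${disk%%/*}"          # keep only the device name
        mode="$(cat "$attr")"
        case "$mode" in
            "write through") note="no volatile cache reported" ;;
            "write back")    note="volatile cache reported" ;;
            *)               note="unrecognized value" ;;
        esac
        printf '%s: %s (%s)\n' "$disk" "$mode" "$note"
    done
}
```

Calling report_write_cache with no argument reads /sys/block. As far as I 
understand the kernel's sysfs block documentation, the attribute is also 
writable, but writing to it only changes the kernel's view of the device and 
does not alter the device's own cache state.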


--
Eric Wheeler



