On Tue, 23 Nov 2021, Coly Li wrote:
> On 11/20/21 8:06 AM, Eric Wheeler wrote:
> > Hi Coly, Kai, and Kent, I hope you are well!
> >
> > On Thu, 18 Nov 2021, Kai Krakow wrote:
> >
> >> Hi Coly!
> >>
> >> Reading the commit logs, it seems to come from using a non-default
> >> block size, 512 in my case (although I'm pretty sure that *is* the
> >> default on the affected system). I've checked:
> >> ```
> >> dev.sectors_per_block   1
> >> dev.sectors_per_bucket  1024
> >> ```
> >>
> >> The non-affected machines use 4k blocks (sectors per block = 8).
> >
> > If it is the cache device with 4k blocks, then this could be a known issue
> > (perhaps) not directly related to the 5.15 release. We've hit it before:
> >   https://www.spinics.net/lists/linux-bcache/msg05983.html
> >
> > and I just talked to Frédéric Dumas this week who hit it too (cc'ed).
> > His solution was to use manufacturer disk tools to change the cachedev's
> > logical block size from 4k to 512-bytes and reformat (see below).
> >
> > We've not seen issues with the backing device using 4k blocks, but bcache
> > doesn't always seem to make 4k-aligned IOs to the cachedev. It would be
> > nice to find a long-term fix; more and more SSDs support 4k blocks, which
> > is a nice x86 page-alignment and may provide for less CPU overhead.
> >
> > I think this was the last message on the subject from Kent and Coly:
> >
> > > On 2018/5/9 3:59 PM, Kent Overstreet wrote:
> > > > Have you checked extent merging?
> > >
> > > Hi Kent,
> > >
> > > Not yet. Let me look into it.
> > >
> > > Thanks for the hint.
> > >
> > > Coly Li
>
> I tried and I still remember this, the headache is, I don't have a 4Kn SSD to
> debug and trace, just looking at the code is hard...

The scsi_debug driver can do it:

    modprobe scsi_debug sector_size=4096 dev_size_mb=$((128*1024))

That will give you a 128 GB SCSI ram disk with 4k sectors. If that is
enough for a cache to test against then you could run your super-high-IO
test against it and see what you get.
I would be curious how testing bcache on the scsi_debug ramdisk in
writeback performs!

> If anybody can send me (in China to Beijing) a 4Kn SSD to debug and testing,
> maybe I can make some progress. Or can I configure the kernel to force a
> specific non-4Kn SSD to only accept 4K aligned I/O ?

I think the scsi_debug option above might be cheaper ;)

But seriously, Frédéric who reported this error was using an Intel P3700
if someone (SUSE?) wants to fund testing on real hardware. <$150 used on
eBay:

I'm not sure how to format it 4k, but this is how Frédéric set it to 512
bytes and fixed his issue:

    # intelmas start -intelssd 0 -nvmeformat LBAFormat=0
    # intelmas start -intelssd 1 -nvmeformat LBAFormat=0

-Eric

>
> Coly Li
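
As a rough cross-check of the numbers in this thread, the sketch below redoes the arithmetic in shell. It assumes bcache's sectors_per_block and sectors_per_bucket are counted in 512-byte sectors (consistent with "sectors per block = 8" for 4k blocks) and that scsi_debug's dev_size_mb is in units of 1024*1024 bytes. The helper names are made up for illustration and are not part of bcache-tools or scsi_debug. The last helper reads the standard sysfs queue attribute and is only defined here, not called, since it needs a real block device.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Convert a count of 512-byte sectors (as bcache reports them) to bytes.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Convert a size in GiB to the MiB value expected by dev_size_mb.
gib_to_mib() {
    echo $(( $1 * 1024 ))
}

# Print the logical block size of a block device, e.g. "sdb".
# Reads /sys/block/<dev>/queue/logical_block_size.
logical_block_size() {
    cat "/sys/block/$1/queue/logical_block_size"
}

sectors_to_bytes 1      # sectors_per_block = 1 on the affected machine
sectors_to_bytes 8      # sectors_per_block = 8 on the unaffected machines
sectors_to_bytes 1024   # sectors_per_bucket = 1024
gib_to_mib 128          # the $((128*1024)) passed as dev_size_mb
```

After loading scsi_debug with sector_size=4096, something like `logical_block_size sdX` (with sdX replaced by whatever name the new disk gets) should report 4096 if the device really presents 4k logical sectors.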