On 2019/6/26 8:04 AM, Eric Wheeler wrote:
> On Tue, 25 Jun 2019, Marc Smith wrote:
>
>> Hi,
>>
>> I've been experimenting using bcache and MD RAID on Linux 4.14.91. I
>> have a 12-disk RAID6 MD array as the backing device, and a decent NVMe
>> SSD as the caching device. I'm testing using write-back mode.
>>
>> I've been able to tune the sequential_cutoff so when issuing full
>> stripe writes to the bcache device, these bypass hitting the cache
>> device and go right into the MD RAID6 array, which seems to be working
>> nicely.
>>
>> In the next experiment, when performing more random / sequential
>> (mixed) writes, the cache device does a nice job of keeping up
>> performance. However, when watching the data get flushed from the
>> cache device to the backing device (the MD RAID6 volume), it doesn't
>> seem the data is being written out as mostly full stripe writes. I get
>> a lot of RMW's on the drives, so I don't believe I'm seeing these full
>> stripe writes. I was sort of hoping/expecting bcache to do some
>> re-ordering with this... there seem to be some knobs in bcache where
>> it detects the full stripe size, and it knows partial stripe writes
>> are expensive.
>>
>> So I guess my question is if it's known that the data is not
>> re-ordered using full stripe geometry in bcache, or perhaps this is
>> just a tunable that I'm not seeing? It seems bcache has access to this
>> data, but maybe this is a future item where it could be implemented?
>>
>> The problem of course comes from the sub-par performance when data
>> is flushed from the cache device to the backing device... lots of
>> read-modify-writes result in very poor write performance. If the I/O
>> was pushed to the backing device as full stripe I/O's (or at least
>> mostly) I'd expect to see better performance when flushing the cache.
>
> You could try turning up /sys/block/bcache0/bcache/writeback_percent .
> Maybe there aren't enough contiguous regions in the writeback cache to
> queue for write.
>
> Coly,
>
> Do you know how the nr_stripes, stripe_sectors_dirty and
> full_dirty_stripes bitmaps work together to make a best-effort of writing
> full stripes to the disk, and maybe you can explain under what
> circumstances partial stripes would be written?

Hi Eric,

I don't have a satisfying answer to the above question. But if the upper
layers don't issue full-stripe-aligned I/Os, bcache cannot do anything
more than merge adjacent blocks. For random I/Os, only a small part of
the I/O requests can be merged; after the writeback thread has been
working for a while, almost all writeback I/Os are small and not
stripe-aligned, even though they are ordered by LBA.

Thanks.

--
Coly Li
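
To illustrate the point about merging, here is a minimal Python sketch. It
is not bcache code. It assumes a 512 KiB MD chunk size (the thread does not
state one), so a 12-disk RAID6 with 10 data disks would have a full stripe of
10240 sectors. The helper names merge_adjacent and count_full_stripes are
hypothetical. The sketch merges LBA-sorted dirty extents and counts how many
full stripes they cover completely.

```python
import random

SECTOR = 512
CHUNK_SECTORS = 512 * 1024 // SECTOR   # assumed 512 KiB chunk size
DATA_DISKS = 12 - 2                    # RAID6: two disks' worth of parity
STRIPE_SECTORS = CHUNK_SECTORS * DATA_DISKS


def merge_adjacent(extents):
    """Sort (start, length) extents by LBA and merge those that touch or overlap."""
    merged = []
    for start, length in sorted(extents):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, max(prev_len, start + length - prev_start))
        else:
            merged.append((start, length))
    return merged


def count_full_stripes(extents, stripe_sectors=STRIPE_SECTORS):
    """Count stripes that are completely covered by the merged extents."""
    full = 0
    for start, length in merge_adjacent(extents):
        first = -(-start // stripe_sectors)        # first boundary at or after start
        last = (start + length) // stripe_sectors  # last boundary at or before end
        full += max(0, last - first)
    return full


if __name__ == "__main__":
    rng = random.Random(0)
    # 1000 random 4 KiB writes (8 sectors each) spread over 100 stripes.
    blocks = 100 * STRIPE_SECTORS // 8
    writes = [(rng.randrange(blocks) * 8, 8) for _ in range(1000)]
    print("extents after merge:", len(merge_adjacent(writes)))
    print("full stripes:", count_full_stripes(writes))
```

Under these assumptions one stripe holds 1280 blocks of 4 KiB, so 1000 random
4 KiB writes cannot fill even one stripe. Every flush is then a partial
stripe write that forces a read-modify-write on the array, regardless of how
the extents are ordered.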