Hello!

On Thu, 7 Feb 2019 at 21:51, Nix <nix@xxxxxxxxxxxxx> wrote:
> btw I have ported ewheeler's ioprio-based cache hinting patch to 4.20;
> I/O below the ioprio threshold bypasses everything, even metadata and
> REQ_PRIO stuff. It was trivial, but I was able to spot and fix a tiny
> bypass accounting bug in the patch in the process: see
> http://www.esperi.org.uk/~nix/bundles/bcache-ioprio.bundle. (I figured
> you didn't want almost exactly the same patch series as before posted to
> the list, but I can do that if you prefer.)

I compared this to my branch of the patches and cannot spot a
difference: where's the tiny bypass accounting bug you fixed? Here's my
branch:
https://github.com/kakra/linux/compare/master...kakra:rebase-4.20/bcache-updates

> Semi-unrelated side note: after my most recent reboot, which involved a
> bcache journal replay even though my shutdown was clean, the stats_total
> reset; the cache device's bcache/written and
> bcache/set/cache_available_percent also flipped to 0 and 100%. I
> suspect this is merely a stats bug of some sort, because the boot was
> notably faster than before and cache_hits was about 6000 by the time it
> was done. bcache/priority_stats *does* say that the cache is "only" 98%
> unused, like it did before. Maybe cache_available_percent doesn't mean
> what I thought it did.

There's still a problem with bcache doing writebacks very, very slowly,
at only 4 kB/s: the rate knobs count 512-byte sectors per second, and
the default writeback_rate_minimum of 8 sectors works out to 4 kB/s. My
system generates more than 4 kB/s of writes, so it will effectively
never finish writing back dirty data. This can become a huge PITA if
bcache dies for whatever reason - you lose the writes still sitting in
bcache. FWIW, you really never ever want to lose the bcache device
while it still holds outstanding dirty data. So I added a udev rule to
work around that until the idle detection works correctly and switches
to fast writeback:

$ cat /etc/udev/rules.d/01-bcache-writeback.rules
ACTION=="add|change", KERNEL=="bcache*", ATTR{bcache/writeback_rate_minimum}="8192"

This makes the writeback worker write at least 4 MB/s (8192 sectors of
512 bytes per second), which should be no problem for spinning rust
even under stress. But YMMV, you may want to adjust that value.

Under normal workload, I now see only around 20-30 MB of dirty data at
most, and only for a few seconds. Of course, these numbers depend a lot
on your workload, but for me I now consider this fairly safe: there is
a good chance that btrfs can rewind to the previous transaction in case
of a cache loss (though I'd consider the filesystem broken anyway at
that point and would restore from backup - but I could still recover my
latest work). This was somewhat battle-tested because I lost my bcache
twice mid-air due to a memory module gone bad.

Before this change, I saw the cache capacity not being fully used and a
lot less of the cache being evictable. Writing dirty data back in a
timely manner makes space for new cache misses to be stored in the
cache.

Regards,
Kai
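
PS: In case anyone wants to try this without rebooting, here's a
minimal sketch of applying the same floor at runtime through sysfs
(bcache0 is just an example device name, adjust to your setup; as
above, the value is in 512-byte sectors per second, so 8192 comes out
at 4 MB/s):

$ # Raise the writeback floor on a running device to ~4 MB/s:
$ sudo sh -c 'echo 8192 > /sys/block/bcache0/bcache/writeback_rate_minimum'
$ # Watch the dirty data drain and the effective writeback rate:
$ cat /sys/block/bcache0/bcache/dirty_data
$ cat /sys/block/bcache0/bcache/writeback_rate_debug

And to pick up the udev rule itself without a reboot, reloading the
rules and re-triggering a change event on the block devices should
work too:

$ sudo udevadm control --reload
$ sudo udevadm trigger --subsystem-match=block --action=change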