Hello Coly!

On Thu, 22 Apr 2021 at 18:05, Coly Li <colyli@xxxxxxx> wrote:
> In direct I/Os, to read the just-written data, the reader must wait and
> make sure the previous write complete, then the reading data should be
> the previous written content. If not, that's bcache bug.

Isn't this report exactly about that? DIO data has been written, then
written again differently by a concurrent process, and when you read it
back, either of the two versions may come back (let's call that state A).
But the problem here is that this is not persistent, and that should not
happen: bcache now has stale content in its cache, and after write-back
has finished, the contents of the earlier read (state A) change to a new
state B. That is not what you should expect from direct I/O: the contents
have literally changed under your feet, with far too high a latency. Once
a read has confirmed that the data is in some state A after concurrent
writes, it should not change to a state B after bcache finishes
write-back.

> You may try the above steps on non-bcache block devices with/without
> file systems, it is probably to reproduce similar "race" with parallel
> direct read and writes.

My guess is that the bcache results show a much longer window of
inconsistency between racing writes and reads, in the range of minutes or
even hours. So there would be no way to reliably verify your DIO writes
with a subsequent read and be sure that this state persists, simply
because there may still be outstanding bcache dirty data. (A rough sketch
of the kind of read-back check I mean is appended at the end of this
mail.)

I wonder if this is why I'm seeing btrfs corruptions with bcache when I
enable auto-defrag in btrfs. OTOH, I didn't check the code to see how
auto-defrag is actually implemented and whether it uses some direct-I/O
path under the hood.

Regards,
Kai
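
P.S.: For illustration only, a rough sketch of the read-back check I
described above, not a tested reproducer. It assumes a bcache device at
/dev/bcache0, a 4 KiB block size and a generous sleep to give write-back
time to run; the concurrent re-writer from the original report would run
as a separate process in parallel.

/*
 * Sketch: write a block with O_DIRECT, read it back to capture
 * "state A", wait for bcache write-back, read again and check that
 * the data did not silently change to a "state B".
 * Device path, block size and sleep interval are assumptions.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLKSZ 4096                      /* assumed logical block size */

int main(void)
{
	const char *dev = "/dev/bcache0";  /* assumed device under test */
	void *wbuf, *r1, *r2;
	int fd;

	/* O_DIRECT needs aligned buffers. */
	if (posix_memalign(&wbuf, BLKSZ, BLKSZ) ||
	    posix_memalign(&r1, BLKSZ, BLKSZ) ||
	    posix_memalign(&r2, BLKSZ, BLKSZ))
		return 1;

	fd = open(dev, O_RDWR | O_DIRECT);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(wbuf, 'A', BLKSZ);
	if (pwrite(fd, wbuf, BLKSZ, 0) != BLKSZ) {
		perror("pwrite");
		return 1;
	}

	/* First read-back: whatever comes back here is "state A". */
	if (pread(fd, r1, BLKSZ, 0) != BLKSZ) {
		perror("pread");
		return 1;
	}

	/* Give bcache time to flush dirty data to the backing device. */
	sleep(300);

	/* Second read-back: with correct DIO semantics this must still
	 * equal state A, even after write-back has finished. */
	if (pread(fd, r2, BLKSZ, 0) != BLKSZ) {
		perror("pread");
		return 1;
	}

	if (memcmp(r1, r2, BLKSZ))
		fprintf(stderr, "data changed after write-back (state A != state B)\n");
	else
		printf("data stable across write-back\n");

	close(fd);
	return 0;
}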