On Fri, Apr 23, 2021 at 8:53 AM Coly Li <colyli@xxxxxxx> wrote:
>
> On 4/23/21 6:19 AM, Kai Krakow wrote:
> > Hello Coly!
> >
> > On Thu, Apr 22, 2021 at 18:05, Coly Li <colyli@xxxxxxx> wrote:
> >
> >> In direct I/Os, to read the just-written data, the reader must wait and
> >> make sure the previous write complete, then the reading data should be
> >> the previous written content. If not, that's bcache bug.
> >
> > Isn't this report exactly about that? DIO data has been written, then
> > differently written again with a concurrent process, and when you read
> > it back, any of both may come back (let's call it state A). But the
> > problem here is that this is not persistent, and that should actually
> > not happen: bcache now has stale content in its cache, and after write
> > back finished, the contents of the previous read (from state A)
> > changed to a new state B. And this is not what you should expect from
> > direct IO: The contents have literally changed under your feet with a
> > much too high latency: If some read already confirmed that data has
> > some state A after concurrent writes, it should not change to a state
> > B after bcache finished write-back.
>
> Hi Kai,
>
> Your comments make me have a better comprehension. Yes the staled key
> continues to exist even after a reboot, it is problematic.
>
> >
> >
> >> You may try the above steps on non-bcache block devices with/without
> >> file systems, it is probably to reproduce similar "race" with parallel
> >> direct read and writes.
> >
> > I'm guessing the bcache results would suggest there's a much higher
> > latency of inconsistency between write and read races, in the range of
> > minutes or even hours. So there'd be no chance to properly verify your
> > DIO writes by the following read and be sure that this state persists
> > - just because there might be outstanding bcache dirty data.
> >
> > I wonder if this is why I'm seeing btrfs corruptions with bcache when
> > I enabled auto-defrag in btrfs. OTOH, I didn't check the code on how
> > auto-defrag is actually implemented and if it uses some direct-io path
> > under the hood.
>
> Hi Marc,
>
> It seems that if a read miss hits an in-flight writethrough I/O on the
> backing device, such a read request should be served without caching onto
> the cache set.
>
> Do you have a patch for the fix?

Yes, we do have a patch that we are testing, and we would like advice on
whether it's the correct/acceptable approach. I'll post what we have shortly.

--Marc

>
> Thanks.
>
> Coly Li