Re: Race Condition Leads to Corruption

Coly Li <colyli@xxxxxxx> · Fri, 23 Apr 2021 23:53:29 +0800

On 4/23/21 6:19 AM, Kai Krakow wrote:
> Hello Coly!
> 
> On Thu, 22 Apr 2021 at 18:05, Coly Li <colyli@xxxxxxx> wrote:
> 
>> In direct I/O, to read just-written data, the reader must wait until
>> the previous write completes; the data read should then be the
>> previously written content. If not, that's a bcache bug.
> 
> Isn't this report exactly about that? DIO data has been written, then
> written again with different content by a concurrent process, and when
> you read it back, either of the two may come back (let's call it state
> A). But the problem here is that this is not persistent, and that
> should not happen: bcache now has stale content in its cache, and
> after write-back has finished, the contents returned by the previous
> read (state A) change to a new state B. This is not what you should
> expect from direct I/O: the contents have literally changed under your
> feet, with far too high a latency. If some read has already confirmed
> that the data is in state A after the concurrent writes, it should not
> change to state B after bcache finishes write-back.

Hi Kai,

Your comments help me understand the problem better. Yes, the stale key
continues to exist even after a reboot, which is problematic.

> 
>> You may try the above steps on non-bcache block devices, with or
>> without file systems; it is probably possible to reproduce a similar
>> "race" with parallel direct reads and writes.
> 
> I'm guessing the bcache results would suggest there's a much higher
> latency of inconsistency between write and read races, in the range of
> minutes or even hours. So there'd be no chance to properly verify your
> DIO writes by the following read and be sure that this state persists
> - just because there might be outstanding bcache dirty data.
> 
> I wonder if this is why I'm seeing btrfs corruptions with bcache when
> I enable auto-defrag in btrfs. OTOH, I didn't check the code to see how
> auto-defrag is actually implemented and whether it uses some direct-io
> path under the hood.

Hi Marc,

It seems that if a read miss hits an in-flight writethrough I/O on the
backing device, that read request should be served without caching its
data onto the cache set.
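
To make the idea concrete, here is a rough toy model in Python. It is
only a sketch and not bcache code: the class ToyCache, its methods, and
the bypass_on_overlap flag are hypothetical names, and the model assumes
a single sector, a writethrough write that updates both the backing
device and the cache when it completes, and a read miss that is split
into a "begin" (fetch from backing device) and an "end" (insert into
cache) step so the interleaving can be shown explicitly.

```python
class ToyCache:
    """Toy model of a writethrough cache in front of a backing device."""

    def __init__(self, bypass_on_overlap):
        self.backing = {}        # sector -> data on the backing device
        self.cache = {}          # sector -> data on the cache set
        self.inflight = set()    # sectors with a writethrough in progress
        self.bypass_on_overlap = bypass_on_overlap

    def write_begin(self, sector):
        # The write has been submitted but has not completed yet.
        self.inflight.add(sector)

    def write_end(self, sector, data):
        # Writethrough: the data lands on the backing device and the cache.
        self.backing[sector] = data
        self.cache[sector] = data
        self.inflight.discard(sector)

    def read_miss_begin(self, sector):
        # Fetch from the backing device and remember whether a write to
        # the same sector was in flight at that moment.
        return self.backing.get(sector), sector in self.inflight

    def read_miss_end(self, sector, data, overlapped):
        # Proposed behaviour: serve the read, but do not cache the data
        # if it may already be stale.
        if not (self.bypass_on_overlap and overlapped):
            self.cache[sector] = data
        return data

    def read(self, sector):
        if sector in self.cache:
            return self.cache[sector]
        data, overlapped = self.read_miss_begin(sector)
        return self.read_miss_end(sector, data, overlapped)


def run(bypass_on_overlap):
    dev = ToyCache(bypass_on_overlap)
    dev.backing[0] = "A"
    dev.write_begin(0)                          # write of "B" is in flight
    data, overlapped = dev.read_miss_begin(0)   # read miss fetches old "A"
    dev.write_end(0, "B")                       # write completes
    dev.read_miss_end(0, data, overlapped)      # read miss completes last
    return dev.read(0), dev.backing[0]          # (later read, backing device)
```

In this model, run(False) leaves the stale "A" in the cache while the
backing device holds "B", and run(True) keeps the cache and the backing
device consistent. Real code would of course have to track overlapping
ranges rather than single sectors.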

Do you have a patch for the fix?

Thanks.

Coly Li