Re: bad file access (bit-rot + AFR)

Venky Shankar <yknev.shankar@xxxxxxxxx> · Tue, 30 Jun 2015 11:16:45 +0530

On Tue, Jun 30, 2015 at 10:21 AM, Raghavendra Bhat <rabhat@xxxxxxxxxx> wrote:
> On 06/27/2015 03:28 PM, Venky Shankar wrote:
>>
>>
>>
>> On 06/27/2015 02:32 PM, Raghavendra Bhat wrote:
>>>
>>> Hi,
>>>
>>> There is a patch that is submitted for review to deny access to objects
>>> which are marked as bad by scrubber (i.e. the data of the object might have
>>> been corrupted in the backend).
>>>
>>> http://review.gluster.org/#/c/11126/10
>>> http://review.gluster.org/#/c/11389/4
>>>
>>> The above 2 patch sets solve the problem of denying access to the bad
>>> objects (they have passed regression and received a +1 from Venky). But in
>>> our testing we found a race window (which grows as the scrubber runs less
>>> frequently) in which the self-heal daemon can heal the contents of the bad
>>> file before the scrubber can mark it as bad.
>>>
>>> I am not sure whether this issue can be hit when the data truly gets
>>> corrupted in the backend. But in our testing, to simulate backend
>>> corruption, we modify the contents of the file directly in the backend. In
>>> this case, before the scrubber can mark the object as bad, the self-heal
>>> daemon kicks in and heals the contents of the bad file to the good copy. Or,
>>> if a client accesses the file before the scrubber marks it as bad, AFR
>>> finds a mismatch in metadata (since we modified the contents of the file
>>> in the backend) and does data and metadata self-heal, thus copying the
>>> contents of the bad copy to the good copy. From then on, clients accessing
>>> that object always get bad data.
>>
>>
>> I understand from Ravi (ranaraya@) that AFR-v2 would choose the "biggest"
>> file as the source, provided that afr xattrs are "clean" (AFR-v1 would give
>> back EIO). If a file is modified directly from the brick but leaves the size
>> unchanged, contents can be served from either copy. For self-heal to detect
>> anomalies, there needs to be verification (checksum/signature) at each stage
>> of its operation. But this might be too heavy on the I/O side. We could
>> still cache mtime [but update on client I/O] after pre-check, but this still
>> would not catch bit flips (unless a filesystem scrub is done).
>>
>> Thoughts?
>>
>
> Yes. Even if one wants to verify just before healing the file, the time
> taken to verify the checksum might be large if the file is large. It might
> affect the self-heal performance.

Yes, but only when bitrot is enabled.

Probably this needs a bit more thinking.

>
> Regards,
> Raghavendra Bhat
>
>
>>>
>>> Pranith, do you have any solution for this? Venky and I are trying to
>>> come up with one.
>>>
>>> But does this issue block the above patches in any way? (Those 2 patches
>>> are still needed to deny access to objects once they are marked as bad by
>>> scrubber).
>>>
>>>
>>> Regards,
>>> Raghavendra Bhat
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel@xxxxxxxxxxx
>>> http://www.gluster.org/mailman/listinfo/gluster-devel