Re: Filesystem corruption on RAID1

Gionatan Danti <g.danti@xxxxxxxxxx> · Thu, 17 Aug 2017 23:21:38 +0200

On 17-08-2017 23:01 Roger Heflin wrote:
But even if you figured out which it was, you would have no way to
know what writes were still sitting in the cache, it could be pretty
much any writes from the last few seconds (or longer depending on how
exactly the drive firmware works), and it would add additional
complexity to keep a list of recent writes to validate actually
happened in the case of an unexpected drive reset.  This is probably
more of an "avoid this failure condition" situation, since it is not a
normal failure mode but a very rare one.

Yes, but having identified the power-cycled disk, the system can now
take the most sensible action.
For example, it can re-sync it with its mirror disk, basically treating 
it as a --add-spare action.
Or it can simply consider the disk as failing, kick it out of the
array and send an alert email.
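
To make the two options concrete, here is a minimal sketch that only
builds the mdadm command lines for each policy and does not run them.
The function name, the policy names and the device paths are
hypothetical; the mdadm options used (--fail, --remove, --add-spare)
are existing manage-mode options. Mail alerting is left out: as far as
I know it is normally handled by "mdadm --monitor" with MAILADDR set in
mdadm.conf.

```python
from typing import List


def recovery_commands(array: str, member: str, policy: str) -> List[List[str]]:
    """Return mdadm command lines (argv lists) for a suspect RAID1 member.

    Nothing is executed here; the caller decides whether to run them.
    'array', 'member' and the policy names are illustrative assumptions.
    """
    # Both policies start by marking the member faulty and removing it.
    kick = [
        ["mdadm", array, "--fail", member],
        ["mdadm", array, "--remove", member],
    ]
    if policy == "resync":
        # Add the disk back as a plain spare, so it is rebuilt from its mirror.
        return kick + [["mdadm", array, "--add-spare", member]]
    if policy == "kick":
        # Leave the array degraded and let the admin decide what to do.
        return kick
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    for argv in recovery_commands("/dev/md0", "/dev/sdb1", "resync"):
        print(" ".join(argv))
```

Either way the suspect disk stops serving reads until it has been
rebuilt or replaced.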

What the system should not do is nothing: as differences accumulate,
reading from the array becomes non-deterministic. In other words, two
reads of the same block can return two different results, depending on
which disk was queried. This *will* cause all sorts of problems.
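
The effect can be illustrated with a toy model. This is only a sketch
under stated assumptions, not how md works internally: ToyDisk and
ToyMirror are made-up classes, the write cache is a plain dict that is
lost on power cycle, and reads simply alternate between the two disks
(real md read balancing is more involved).

```python
class ToyDisk:
    """A disk with a volatile write cache (illustrative model only)."""

    def __init__(self, nblocks):
        self.media = [b"old"] * nblocks  # data on stable storage
        self.cache = {}                  # acknowledged but unflushed writes

    def write(self, lba, data):
        self.cache[lba] = data

    def flush(self):
        for lba, data in self.cache.items():
            self.media[lba] = data
        self.cache.clear()

    def power_cycle(self):
        # An unexpected reset silently drops whatever was still cached.
        self.cache.clear()

    def read(self, lba):
        return self.cache.get(lba, self.media[lba])


class ToyMirror:
    """Two-disk mirror that alternates reads between its members."""

    def __init__(self, nblocks=4):
        self.disks = [ToyDisk(nblocks), ToyDisk(nblocks)]
        self._next = 0

    def write(self, lba, data):
        for disk in self.disks:
            disk.write(lba, data)

    def read(self, lba):
        disk = self.disks[self._next]
        self._next = (self._next + 1) % len(self.disks)
        return disk.read(lba)

    def resync(self, good, stale):
        # Copy every block from the good member over the stale one.
        src, dst = self.disks[good], self.disks[stale]
        for lba in range(len(src.media)):
            dst.write(lba, src.read(lba))
        dst.flush()


def demo():
    mirror = ToyMirror()
    mirror.write(0, b"new")
    mirror.disks[1].power_cycle()  # disk 1 loses the cached write
    mirror.disks[0].flush()        # disk 0 commits it normally
    before = [mirror.read(0), mirror.read(0)]
    mirror.resync(good=0, stale=1)
    after = [mirror.read(0), mirror.read(0)]
    return before, after


if __name__ == "__main__":
    print(demo())
```

In this model the two reads before the resync disagree (one disk has
the new data, the other still has the old), and they agree again once
the stale disk has been rebuilt from its mirror.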

Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8