Re: Filesystem corruption on RAID1

On 14-07-2017 02:32, Reindl Harald wrote:
because you won't be that happy when the kernel spits out a disk each
time a random SATA command times out - the 4 RAID10 disks on my
workstation are from 2011 and have shown such timeouts several times
in the past, while the disks themselves are just fine

here you go:
http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/

Hi, so premature/preventive drive detachment is not a silver bullet, and I buy that. However, I would at least expect this behavior to be configurable. Maybe it is, and I am missing something?
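
For what it's worth, the linked article suggests aligning the drive's internal error-recovery time with the kernel's SCSI command timeout. A minimal sketch of the kernel-side half, assuming the standard sysfs interface /sys/block/<dev>/device/timeout and using "sda" purely as a placeholder:

/* Sketch: raise the SCSI layer's command timeout for one disk,
 * so the drive gets time to finish internal error recovery before
 * the kernel aborts the command. Assumes the standard sysfs file
 * /sys/block/<dev>/device/timeout (in seconds); "sda" is just a
 * placeholder. Must run as root. */
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/block/sda/device/timeout";
    FILE *f = fopen(path, "w");

    if (!f) {
        perror("fopen");
        return 1;
    }
    /* 180 s is the value the article proposes for desktop drives
     * without configurable SCT ERC. */
    fprintf(f, "180\n");
    if (fclose(f) != 0) {
        perror("fclose");
        return 1;
    }
    return 0;
}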

Anyway, what really surprises me is *not* that the drive is not detached, but rather that corruption is permitted to make its way into real data. I naively expect that when a WRITE_QUEUED or CACHE_FLUSH command aborts/fails (which *will* cause data corruption if not properly handled), the I/O layer has the following possibilities:

a) retry the write/flush. You don't want to retry indefinitely, so the kernel needs some kind of counter/threshold; when the threshold is reached, continue with b). This would mask out sporadic errors while propagating recurring ones (see the sketch after this list);

b) notify the upper layer that a write error happened. For synchronous and direct writes it can do that simply by returning the proper error code to the calling function. In this case, the block layer should return an error to the MD driver, which must act accordingly: for example, by dropping the disk from the array.

c) do nothing. This seems to me by far the worst choice.
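
To make a) and b) concrete, here is a minimal C sketch of the logic I have in mind. This is not the kernel's actual code path: submit_write() is a hypothetical stand-in for the real block-layer submission primitive, stubbed out here so the sketch compiles and runs:

/* Options a) and b): bounded retries, then error propagation. */
#include <errno.h>
#include <stdio.h>

#define MAX_WRITE_RETRIES 3   /* the counter/threshold from a) */

/* Stub: pretend the device aborts every WRITE (returns -EIO). */
static int submit_write(const void *buf, unsigned long len)
{
    (void)buf; (void)len;
    return -EIO;
}

/* Retry a failed write a bounded number of times (a); once the
 * threshold is exhausted, propagate the error upward (b) so the
 * caller - e.g. the MD driver - can drop the disk from the array. */
static int write_with_retry(const void *buf, unsigned long len)
{
    int err = 0;

    for (int attempt = 0; attempt < MAX_WRITE_RETRIES; attempt++) {
        err = submit_write(buf, len);
        if (err == 0)
            return 0;  /* sporadic error, masked by the retry */
    }
    return err;        /* recurring error: tell the upper layer */
}

int main(void)
{
    char data[512] = { 0 };

    if (write_with_retry(data, sizeof(data)) != 0)
        fprintf(stderr, "write failed after retries: drop the disk\n");
    return 0;
}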

If b) is correctly implemented, it should prevent corruption from accumulating on the drives.

Please also note the *type* of corrupted data: not only user data, but filesystem journal and metadata as well. The latter should be protected by the use of write barriers / FUA requests, so the filesystem should be able to stop itself *before* corruption occurs.
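
As an illustration of what I mean by stopping *before* corruption: a journaling filesystem separates the journal record from the commit record with a flush, and must not write the commit record if that flush fails. A minimal userspace analogue, using fdatasync() as the barrier and a hypothetical "journal.log" file:

/* Userspace analogue of a journal commit with a write barrier:
 * write the journal record, flush, and only write the commit
 * record if the flush reported success. "journal.log" and the
 * record contents are placeholders for illustration. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char record[] = "journal: transaction 42\n";
    const char commit[] = "commit: transaction 42\n";
    int fd = open("journal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, record, strlen(record)) != (ssize_t)strlen(record)) {
        perror("write journal record");
        return 1;
    }
    /* The barrier: if the flush fails, the journal record may not
     * be durable, so the commit record must NOT be written. */
    if (fdatasync(fd) != 0) {
        perror("fdatasync (barrier)");
        return 1;   /* stop *before* corruption */
    }
    if (write(fd, commit, strlen(commit)) != (ssize_t)strlen(commit)) {
        perror("write commit record");
        return 1;
    }
    if (fdatasync(fd) != 0) {
        perror("fdatasync (commit)");
        return 1;
    }
    close(fd);
    return 0;
}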

So I have some very important questions:
- how does MD behave when flushing data to disk?
- does it propagate write barriers?
- when a write barrier fails, is the error propagated to the upper layers?

Thank you all.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8