Re: Filesystem corruption on RAID1

Gionatan Danti <g.danti@xxxxxxxxxx> · Thu, 17 Aug 2017 16:31:39 +0200

On 17-08-2017 14:41, Roger Heflin wrote:

Here is a guess based on what you determined was the cause.

The mid-layer does not know the writes were lost. The writes were in
the drive's write cache (already submitted to the drive and confirmed
back to the mid-layer as done, even though they were not yet on the
platter), and when the drive lost power and "rebooted" those writes
disappeared. The write(s) the mid-layer had in progress, which never
got a done from the drive, failed, were retried, and succeeded after
the driver reset was completed.

In high-reliability RAID the solution is to turn off that write cache,
*but* if you do direct I/O writes (most databases) with the drive's
write cache off and no battery-backed cache between the two, then the
drive becomes horribly slow, since it must actually write the data to
the platter before telling the next level up that the data is safe.

Sure, disabling caching should at least greatly reduce the problem (torn
writes remain a problem, but they are inevitable).
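
To see whether the kernel currently treats a drive as having a volatile
write cache, something like the following minimal sketch could be used.
It assumes a kernel that exposes the queue/write_cache attribute under
/sys/block; the function names and the sysfs_root parameter are
hypothetical helpers of mine, and the attribute reports the kernel's
view of the cache, which may not match the drive's internal setting.

```python
from pathlib import Path


def write_cache_mode(device, sysfs_root="/sys/block"):
    """Return the kernel's view of the write cache, e.g. 'write back'.

    Assumption: the kernel exposes <sysfs_root>/<device>/queue/write_cache.
    """
    attr = Path(sysfs_root) / device / "queue" / "write_cache"
    return attr.read_text().strip()


def has_volatile_write_cache(device, sysfs_root="/sys/block"):
    """True if the kernel treats the device cache as write-back."""
    return write_cache_mode(device, sysfs_root) == "write back"
```

For example, has_volatile_write_cache("sda") would be expected to return
True on a disk whose write cache is enabled.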

However, the entire idea of barriers/cache flushes/FUAs was to *safely
enable* unprotected write caches, even in the face of power loss. Indeed,
for full-system power loss they are adequate. However, device-level
micro power losses seem to pose a bigger threat to data reliability.
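
From the application side, the flush-based contract looks roughly like
the sketch below: write, then fsync() before treating the data as
durable. The durable_write name is a hypothetical helper, not something
from this thread. Note what it does not cover: if the drive resets and
drops its cache between two flushes, writes it had already acknowledged
are gone, and a later flush can still succeed without them.

```python
import os


def durable_write(path, data):
    """Write bytes to path and ask the kernel to flush them to stable storage.

    Sketch only: durability still depends on the filesystem issuing a
    cache flush and on the drive honouring it.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Flush file data and metadata; returns only after the kernel
        # reports completion.
        os.fsync(fd)
    finally:
        os.close(fd)
```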

I suspect that the recurrent "my RAID1 array develops a huge number of
mismatch_cnt sectors" question, which is often dismissed with "don't
worry about RAID1 mismatches", really has a strong tie with this
specific problem.
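
For anyone who wants to track this on their own arrays, here is a
minimal sketch that reads the counter from sysfs. It assumes the usual
md layout, /sys/block/<array>/md/mismatch_cnt; the function name and the
sysfs_root parameter are hypothetical. The value is only meaningful
after a "check" pass has run on the array.

```python
from pathlib import Path


def read_mismatch_cnt(array, sysfs_root="/sys/block"):
    """Return the mismatch count md reported for the array, as an int.

    Assumption: <sysfs_root>/<array>/md/mismatch_cnt exists, which is
    the case for md arrays such as md0.
    """
    attr = Path(sysfs_root) / array / "md" / "mismatch_cnt"
    return int(attr.read_text().strip())
```

Logging read_mismatch_cnt("md0") after each scheduled check would show
whether the count grows after device resets.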

I suggest that anyone reading this list also read the current thread on
the linux-scsi list; it is very interesting.
Regards.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8