Re: NVRAM support

Neil Brown <neilb@xxxxxxx> · Thu, 16 Feb 2006 10:00:30 +1100

On Wednesday February 15, mirko.benz@xxxxxx wrote:
> Hi,
> 
> My intention was not to use a NVRAM device for swap.
> 
> Enterprise storage systems use NVRAM for better data protection/faster 
> recovery in case of a crash.
> Modern CPUs can do RAID calculation very fast. But Linux RAID is 
> vulnerable when a crash during a write operation occurs.
> E.g. Data and parity write requests are issued in parallel but only one
> finishes. This will lead to inconsistent data. It will be undetected
> and can not be repaired. Right?

Wrong.  Well, maybe 5% right.

If the array is degraded, then the inconsistency cannot be detected.
If the array is fully functioning, then any inconsistency will be
corrected by a 'resync'.
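
As an illustration only (a toy model in Python, not md code), the
difference between the two cases can be sketched with XOR parity over a
two-disk-plus-parity stripe.  The block values and the helper name
xor_blocks are assumptions made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# A toy stripe: two data blocks plus their parity.
d0, d1 = b"\x11\x11", b"\x22\x22"
parity = xor_blocks([d0, d1])

# Crash mid-write: the new d0 reached disk, the matching parity did not.
d0 = b"\x55\x55"

# Fully functioning array: the mismatch is detectable, and a resync
# simply recomputes parity from the data blocks.
inconsistent = xor_blocks([d0, d1]) != parity
resynced_parity = xor_blocks([d0, d1])

# Degraded array (the disk holding d1 is gone): d1 has to be rebuilt from
# d0 and the stale parity, which silently gives the wrong contents.
rebuilt_d1 = xor_blocks([d0, parity])
```

In the degraded case there is nothing left to compare rebuilt_d1
against, which is why the damage goes unnoticed.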

> 
> How can journaling be implemented within linux-raid?

With a fair bit of work. :-)

> 
> I have seen a paper that tries this in cooperation with a file system:
> „Journal-guided Resynchronization for Software RAID“
> www.cs.wisc.edu/adsl/Publications

This is using the ext3 journal to make the 'resync' (mentioned above)
faster.  Write-intent bitmaps can achieve similar speedups with
different costs.
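
A minimal sketch of the write-intent idea, assuming one dirty bit per
fixed-size region (the class and method names here are hypothetical, not
the md interface): the bit is set before a write is issued and cleared
once it completes, so after a crash only the regions still marked dirty
need to be resynced.

```python
class WriteIntentBitmap:
    """Toy model: one dirty bit per fixed-size region of the array."""

    def __init__(self, region_sectors):
        self.region_sectors = region_sectors
        self.dirty = set()  # stands in for the persistent bitmap

    def begin_write(self, sector):
        # The bit must be made persistent before data/parity writes start.
        self.dirty.add(sector // self.region_sectors)

    def end_write(self, sector):
        # Simplification: cleared eagerly and without counting in-flight
        # writes per region, which a real implementation would have to do.
        self.dirty.discard(sector // self.region_sectors)

    def regions_to_resync(self):
        return sorted(self.dirty)

bitmap = WriteIntentBitmap(region_sectors=1024)
bitmap.begin_write(5000)  # region 4
bitmap.end_write(5000)    # completed cleanly
bitmap.begin_write(9000)  # region 8, crash before completion
```

After the simulated crash, regions_to_resync() names only region 8
rather than the whole array.  The cost is an extra bitmap update ahead
of writes to a clean region.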

> 
> But I would rather see a solution within md so that other file systems 
> or LVM can be used on top of md.

Currently there is no solution to the "crash while writing and
degraded on restart means possible silent data corruption" problem.
However, it is, in reality, a very small problem (unless you regularly
run with a degraded array - don't do that).

The only practical fix at the filesystem level is, as you suggest,
journalling to NVRAM.  There is work underway to restructure md/raid5
to be able to off-load the xor and raid6 calculations to dedicated
hardware.  This restructure would also make it a lot easier to journal
raid5 updates thus closing this hole (and also improving write
latency).
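
To make the journalling idea concrete, here is a rough sketch under
stated assumptions (a Python toy, with a plain attribute standing in for
NVRAM and all names invented for the example; it says nothing about how
md would actually do it).  The new block and new parity are logged
before the in-place writes, so an interrupted update can be replayed on
restart.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

class JournaledStripe:
    def __init__(self, data):
        self.data = list(data)
        self.parity = xor_blocks(self.data)
        self.journal = None  # stands in for an NVRAM journal entry

    def write(self, index, block, crash_after_data=False):
        new_data = self.data[:index] + [block] + self.data[index + 1:]
        # 1. Log the intended update before touching the array.
        self.journal = (index, block, xor_blocks(new_data))
        # 2. Update data in place.
        self.data[index] = block
        if crash_after_data:
            return  # simulated crash: parity on disk is now stale
        # 3. Update parity, then retire the journal entry.
        self.parity = self.journal[2]
        self.journal = None

    def recover(self):
        # Replay a logged update that never completed.
        if self.journal is not None:
            index, block, parity = self.journal
            self.data[index] = block
            self.parity = parity
            self.journal = None

stripe = JournaledStripe([b"\x11", b"\x22"])
stripe.write(0, b"\x55", crash_after_data=True)
stale_parity = stripe.parity
stripe.recover()
```

After recover() the parity matches the data again, so a later rebuild of
either data block from the survivor and parity gives the right answer
even if the array comes back degraded.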

NeilBrown