On 1/29/19 12:14 PM, Werner Fischer wrote:
Hello all,
I'd like to ask whether it is necessary to switch off the write cache of
HDDs and SSDs (without power-loss protection) when they are used for mdraid.
As discussed by Nik.Brt. and Song Liu last week, many storage devices
(HDDs/SSDs) "lie" when they indicate that they have written data. The data
is only in the drive's cache, but not on magnetic disc or flash. "The disk's
embedded microcontroller may signal the main computer that a disk write is
complete immediately after receiving the write data, before the data is
actually written to the platter." [1]
When used as a single disc, this can be handled with modern file systems,
as they use write barriers. [2][3]
But what I'm not sure about is how this is handled by mdraid in case of a
sudden power loss. In the past I've recommended disabling the drive's write
cache by using "hdparm -W0". This is also the default behavior of hardware
RAID controllers: they switch off the drive cache of HDDs because they use
their internal (battery-backed) cache.
So my question is:
Is it safe to keep the cache of HDDs and SSDs (without power-loss
protection) on when used with mdraid?
My 2 cents:
I did some quick tests with slightly better but still consumer-grade
hardware (4x old WD Red/Purple drives and 4x new WD SSDs), on a SAS
controller (non-RAID) and with the HDDs further behind an expander. 0
issues with that perl script (diskchecker.pl; I did 3 tests with each
array, simultaneously).
The blog entry is very old and also explicitly mentions that in ancient
times fsync() didn't request a flush. Today that is of course no longer
the case.
With supposedly so many problematic disks, wouldn't filesystem journaling
completely fall apart if flushes were not working correctly (regardless
of whether it's flush or FUA)? The same goes for a flush sent from within
e.g. a VM, or for anything relying on fsync().
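To make the fsync() point concrete, here is a minimal sketch of what an
application that cares about durability typically does. The function name
durable_write and the file name in the usage part are made up for
illustration. The assumption is a current Linux kernel with barriers left
enabled, where fsync() should result in a cache flush being sent to the
device; whether the drive actually honors it is exactly what diskchecker.pl
is meant to find out.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path and ask the kernel to push it to stable storage.

    Hypothetical helper for illustration. It only requests durability;
    it cannot prove that the drive really emptied its cache.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Returns only after the kernel reports the file data as flushed.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the parent directory so the directory entry itself
    # survives a power loss (relevant for newly created files).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "probe.bin")
        durable_write(target, b"acknowledged only after fsync returned\n")
        print("wrote", os.path.getsize(target), "bytes")
```

If the drive acknowledged the flush without really writing, this code would
still return normally, which is why a pull-the-plug test is the only real
check.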
Another thing to consider is how many of the supposed issues are caused
by the hw raid controllers and whatever they are trying to do/assume,
and by their time and available power constraints (IDK really), while the
blame gets shifted to the disks.
In theory, battery backup (or equivalent functionality on "enterprise"
SSDs) should let you get away without flush/FUA (e.g. by turning off
filesystem barriers). Whether you really can or should is another thing.
Test if in doubt (that diskchecker.pl is a nifty tool).
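Before testing, it can help to see which cache mode the kernel assumes for
each drive. A rough sketch, assuming a Linux system that exposes the
queue/write_cache attribute under /sys/block ("write back" means the kernel
treats the cache as volatile and sends flushes, "write through" means it
does not). The function name write_cache_modes is made up for illustration,
and this shows the kernel's view only; "hdparm -W" on the device shows the
drive's own setting.

```python
from pathlib import Path


def write_cache_modes(sysfs_root="/sys/block"):
    """Return {device name: cache mode} as reported by the kernel.

    Devices without a readable queue/write_cache attribute are skipped.
    The sysfs_root parameter exists only so the function can be pointed
    at a different directory, e.g. for testing.
    """
    modes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return modes
    for dev in sorted(root.iterdir()):
        attr = dev / "queue" / "write_cache"
        try:
            modes[dev.name] = attr.read_text().strip()
        except OSError:
            # Attribute missing or unreadable for this device; skip it.
            continue
    return modes


if __name__ == "__main__":
    for name, mode in write_cache_modes().items():
        print(f"{name}: {mode}")
```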