Re: some general questions on RAID

On Thu, 2013-07-04 at 18:07 -0400, Phil Turmel wrote:
> Last time I checked, dmcrypt treated barriers as no-ops, so filesystems
> that rely on barriers for integrity can be scrambled.
Whoa... uhmm... that would be awful... (since I already use ext4 on top
of dmcrypt).
But wouldn't that be a general problem of dmcrypt, unrelated to any
further stacking of LVM and/or MD?!

> As such, where I
> mix LVM and dmcrypt, I do it selectively on top of each LV.
I don't quite understand what exactly you mean/do there, and why it should
help against the barrier thingy?
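Or do you mean something like the following? (Just my guess at what you
describe, all device/volume names made up:)

  # LVM directly on the array, dmcrypt selectively per LV:
  pvcreate /dev/md0
  vgcreate vg0 /dev/md0
  lvcreate -L 100G -n data vg0
  cryptsetup luksFormat /dev/vg0/data        # encrypt only this LV
  cryptsetup luksOpen /dev/vg0/data data_crypt
  mkfs.ext4 /dev/mapper/data_crypt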


> I believe dmcrypt is single-threaded, too.
I had thought that, too... at least it used to be the case.
So that would basically mean... if I put dmcrypt on top of MD, then one
single thread would handle the whole encryption for the whole MD and
therefore for all of my (e.g. 4) devices?

But when I do it the other way round... MD being on top of dmcrypt...
then each physical device would get its own dmcrypt device... and also
its own thread, potentially using more CPUs?

Well, the QNAP in question would have 2 cores with HT, so 4 threads...
anyone here with an idea whether the performance boost would be worth
running dmcrypt below MD (which somehow sounds ugly and wrong)?
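To be explicit about the two layouts I'm comparing (only a sketch, array
and device names made up):

  # variant 1: dmcrypt on top of MD -- one crypt mapping for the whole array:
  mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sd[abcd]
  cryptsetup luksFormat /dev/md0
  cryptsetup luksOpen /dev/md0 md0_crypt

  # variant 2: MD on top of dmcrypt -- one crypt mapping (and thread?) per disk:
  for d in a b c d; do
      cryptsetup luksFormat /dev/sd$d
      cryptsetup luksOpen /dev/sd$d sd${d}_crypt
  done
  mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/mapper/sd[abcd]_crypt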


> If either or both of those issues have been corrected, I wouldn't expect
> the layering order to matter.  I'd be nice if a lurking dmcrypt dev or
> enthusiast would chime in here.
I've mailed Milan Broz a pointer to this thread and hope he finds some
time to have a look at it :)

A question in addition... if it's not already done: are there plans to
make dmcrypt multi-threaded (so I could just wait for that and put MD
below it)?


> > But when looking at potential disaster recovery... I think not having MD
> > directly on top of the HDDs (especially having it above dmcrypt) seems
> > stupid.
> I don't know that layering matters much in that case, but I can think of
> many cases where it could complicate things.
What exactly do you mean?

My idea was that when MD is directly above the physical devices... then I
will roughly know which kind of block should be where and how data blocks
should yield parity blocks... i.e. when I do disk forensics or plain dd
access.
When dmcrypt is below, though... all physical devices will look like
complete garbage.
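(The kind of poking around I have in mind is just something like this,
offset made up:

  # peek at a raw member disk; with MD directly on the disks I'd expect to
  # recognise filesystem/parity structures, with dmcrypt underneath it
  # should be indistinguishable from random data:
  dd if=/dev/sda bs=4k skip=123456 count=1 2>/dev/null | hexdump -C | head
)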


> > 2) Chunks / Chunk size
> > a) How does MD work in that matter... is it that it _always_ reads
> > and/or writes FULL chunks?
>
> No.  It does not.  It doesn't go below 4k though.
So what does that mean exactly? It always reads/writes at least 4k
blocks?


> > Guess it must at least do so on _write_ for the RAID levels with parity
> > (5/6)... but what about read?
> No, not even for write.
:-O

> If an isolated 4k block is written to a raid6,
> the corresponding 4k blocks from the other data drives in that stripe
> are read, both corresponding parity blocks are computed, and the three
> blocks are written.
okay, that's clear... but uhm... why have chunk sizes at all then? I mean,
what's the difference between a 128k chunk and a 256k one... when the
parity/data blocks seem to be handled in 4k blocks anyway... or did I get
that completely wrong?
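Anyway, just to check that I parsed the read-modify-write correctly, my toy
version of it (only the XOR parity P; I know Q is a separate Galois-field
sum that plain shell arithmetic can't show):

  # toy 4-drive raid6 stripe at one 4k offset: data blocks d0,d1 plus P,Q.
  # rewriting d0 => read d1 back, recompute P = d0 ^ d1 (and Q its own way),
  # then write d0, P and Q -- the three blocks you mention:
  d1=0x5a; d0_new=0x3c
  printf 'new P: 0x%x\n' $(( d0_new ^ d1 ))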


> > And what about read/write with the non-parity RAID levels (1, 0, 10,
> > linear)... is the chunk size of any real influence here (in terms of
> > reading/writing)?
> Not really.  At least, I've seen nothing on this list that shows any
> influence.
So AFAIU now:
a) Regardless of the RAID level and regardless of the chunk size,
   - data blocks are read/written in 4 KiB blocks
   - when there IS parity information... then that parity information is _ALWAYS_ read/computed/written in 4 KiB blocks.
b) The chunks basically just control how much consecutive data sits on one
device, thereby speeding up / slowing down reads/writes for small / large
files (rough sketch of my mental model below).
But that should basically only matter on seeking devices, i.e. not on
SSDs... thus the chunk size is irrelevant on SSDs...
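I.e. the picture I have in mind is just plain striping (RAID0-style, no
parity rotation; chunk size and block number made up):

  chunk_kib=128; ndev=4
  per_chunk=$(( chunk_kib / 4 ))    # 4 KiB blocks per chunk = 32
  lblock=1000                       # some logical 4 KiB block number
  chunk=$(( lblock / per_chunk ))   # -> chunk 31
  dev=$(( chunk % ndev ))           # -> device 3
  echo "block $lblock -> chunk $chunk on device $dev"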

Is all that right? Phil, Neil? :D



> > b) What's the currently suggested chunk size when having a undetermined
> > [snip]
> For parity raid, large chunk sizes are crazy, IMHO.  As I pointed out in
> another mail, I use 16k for all of mine.
Sounds contradictory to the 4 KiB parity blocks idea?! So why? Or do you
have, by chance, a URL to your other mail? :)



> > 3) Any extra benefit from the parity?
> > [snip]
> This capability exists as a separate userspace utility "raid6check" that
> is in the process of acceptance into the mdadm toolkit.
Interesting... just looking at it.

>   It is not built
> into the kernel, and Neil Brown has a long blog post explaining why it
> shouldn't ever be.
I'll search for it...

>   Built-in "check" scrubs will report such mismatches,
> and the built-in "repair" scrub fixes them by recomputing all parity
> from the data blocks.
So that basically means that parity RAID (i.e. RAID6) *HAS* a
resilience advantage even over the 3-copy RAID10 variant that
we've discussed over there[0], right?
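(And just so I know I'm looking at the right knobs: the built-in
check/repair scrubs you mention are the ones triggered via sysfs, right?
I.e., with a made-up array name:

  echo check  > /sys/block/md0/md/sync_action   # only counts mismatches
  cat /sys/block/md0/md/mismatch_cnt            # non-zero => mismatches found
  echo repair > /sys/block/md0/md/sync_action   # rewrites parity from the data
)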



Thanks a lot,
Chris.

[0] http://thread.gmane.org/gmane.linux.raid/43405/focus=43407
