Re: RAID 10 far and offset on-disk layouts

Gionatan Danti <g.danti@xxxxxxxxxx> · Mon, 13 Jan 2014 09:52:50 +0100

Hi Neil,
let me recap from a previous message:

>FAR LAYOUT
>md(4) states:
>"The first copy of all data blocks will be striped across the early part
>of all drives in RAID0 fashion, and then the next copy of all blocks
>will be striped across a later section of all drives, always ensuring
>that all copies of any given block are on different drives"
>
>The "on different drives" part makes me wonder _how_ chunks are
>distributed. On a 4-disk array, I can imagine a couple of different schemas:
>
>1)	A1 A2 A3 A4
>	.. .. .. ..
>	A4 A1 A2 A3
>
>2)	A1 A2 A3 A4
>	.. .. .. ..
>	A2 A1 A4 A3
>
>The first schema is the one depicted by SuSe documentation [1], while
>the second is the one described by Wikipedia [2].
>
>Question 1: as the two schemas have different reliability
>characteristics, which one is really used?

[1] SuSe entry:
https://www.suse.com/documentation/sles11/stor_admin/data/raidmdadmr10cpx.html#b7cynnk

[2] Wikipedia entry:
http://en.wikipedia.org/wiki/Linux_MD_RAID_10#LINUX-MD-RAID-10 (see how
the far layout is depicted)
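
To make the reliability difference behind Question 1 concrete, here is a
minimal Python sketch. It assumes that the second-copy row drawn in each
schema repeats unchanged for every stripe and that the first copy of chunk
Ak sits on disk k; both are readings of the diagrams, not something checked
against the md code. The names SCHEMA_1, SCHEMA_2, copies_by_chunk and
survivable_pairs are made up for illustration.

```python
from itertools import combinations

# Second-copy row of each schema as drawn above (disks numbered 1..4).
# Entry i is the chunk whose second copy sits on disk i+1; the first copy
# of chunk Ak is ASSUMED to sit on disk k (plain RAID0 striping).
SCHEMA_1 = ["A4", "A1", "A2", "A3"]   # rotated by one disk (SuSe picture)
SCHEMA_2 = ["A2", "A1", "A4", "A3"]   # swapped within disk pairs (Wikipedia picture)

def copies_by_chunk(second_row):
    """Map each chunk name to the set of disks holding a copy of it."""
    placement = {}
    for disk, chunk in enumerate(second_row, start=1):
        first_copy_disk = int(chunk[1:])          # "A3" -> disk 3
        placement[chunk] = {first_copy_disk, disk}
    return placement

def survivable_pairs(second_row):
    """Return the two-disk failures that still leave a copy of every chunk."""
    placement = copies_by_chunk(second_row)
    disks = range(1, len(second_row) + 1)
    return [pair for pair in combinations(disks, 2)
            if all(holders - set(pair) for holders in placement.values())]

print("schema 1 survives:", survivable_pairs(SCHEMA_1))
print("schema 2 survives:", survivable_pairs(SCHEMA_2))
```

Under those assumptions, schema 1 survives 2 of the 6 possible two-disk
failures (disks 1+3 or 2+4), while schema 2 survives 4 of the 6 (everything
except 1+2 and 3+4).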

Keld kindly told me that the SuSe documentation is simply out of date, as it
depicts a layout that changed with newer kernels. So my two questions are:
1) starting from which kernel version is the layout the one depicted by Wikipedia?
2) is it possible, using mdadm, to check which "far" layout is in use?

From what I can see, "mdadm --detail /dev/mdWHATEVER | grep Layout"
tells me whether the array uses the far, near or offset layout, but not the
physical on-disk chunk organization (e.g. far "type" 1 or 2).
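
As a small illustration of what that command exposes, the Python sketch
below pulls the Layout line out of "mdadm --detail" text and splits it into
a layout name and a copy count. The helper name raid10_layout is made up,
the sample text is hand-written rather than captured from a real array, and
the "name=count" format of the value is an assumption.

```python
import re

def raid10_layout(detail_text):
    """Extract (layout_name, copies) pairs from `mdadm --detail` text."""
    match = re.search(r"^\s*Layout\s*:\s*(.+)$", detail_text, re.MULTILINE)
    if match is None:
        return []
    # ASSUMPTION: the value looks like "far=2"; split it into name/count pairs.
    return [(name, int(count))
            for name, count in re.findall(r"(near|far|offset)=(\d+)",
                                          match.group(1))]

# Hand-written sample, NOT captured from a real array.
SAMPLE = """\
     Raid Level : raid10
         Layout : far=2
     Chunk Size : 512K
"""

print(raid10_layout(SAMPLE))
```

A name and a copy count are all this yields; neither says which of the two
"far" chunk arrangements is on disk.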

Anyway, the thread started because I wonder why the OFFSET layout couples
each disk with two other disks. Let me quote again:

>OFFSET LAYOUT
>md(4) states:
>"When 'offset' replicas are chosen, the multiple copies of a given chunk
>are laid out on consecutive drives and at consecutive offsets.
>Effectively each stripe is duplicated and the copies are offset by one
>device."
>
>This means a schema like this:
>
>3)	A1 A2 A3 A4
>	A4 A1 A2 A3
>	.. .. .. ..
>
>However, this is susceptible to the failure of any two adjacent disks. A
>schema like
>
>4)	A1 A2 A3 A4
>	A2 A1 A4 A3
>
>would not suffer from this problem (e.g. disks 2 & 3 can fail and the
>array is still working).
>
>Question 2: apart from simplicity, why does the offset layout use
>schema 3? Am I missing something?

Full thread link: http://marc.info/?t=138815504400002&r=1&w=2
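
The "each disk is coupled with two other disks" observation can be
reproduced with a short Python sketch. It assumes the rotate-by-one reading
of md(4) that schema 3 draws (each stripe is followed by a copy shifted
right by one device) and two copies per chunk. The function names and the
B1..B4 labels for a second stripe are made up for illustration.

```python
def offset_layout(n_disks, n_stripes):
    """Return rows of chunk labels for a 2-copy 'offset' layout as in schema 3."""
    rows = []
    for stripe in range(n_stripes):
        letter = chr(ord("A") + stripe)
        first = [f"{letter}{d + 1}" for d in range(n_disks)]
        # Second copy: same stripe, shifted right by one device (wraps around).
        second = first[-1:] + first[:-1]
        rows.extend([first, second])
    return rows

def disk_partners(rows):
    """For each disk (1-based), the other disks that share a chunk with it."""
    holders = {}
    for row in rows:
        for disk, chunk in enumerate(row, start=1):
            holders.setdefault(chunk, set()).add(disk)
    partners = {d: set() for d in range(1, len(rows[0]) + 1)}
    for disks in holders.values():
        for d in disks:
            partners[d] |= disks - {d}
    return partners

rows = offset_layout(4, 2)
for row in rows:
    print(" ".join(row))
print(disk_partners(rows))
```

With 4 disks every disk ends up with two partners (disk 1 with 2 and 4,
disk 2 with 1 and 3, and so on), so losing a disk together with either of
its neighbours loses a chunk under this reading.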

Excuse me for the long email, I am simply trying to learn something :)
Thank you very much.

On 01/13/2014 12:20 AM, NeilBrown wrote:
On Thu, 09 Jan 2014 09:03:37 +0100 Gionatan Danti <g.danti@xxxxxxxxxx> wrote:

Interesting. Two questions:
1) starting from which kernel version is the layout the one depicted by Wikipedia?

Exactly what depiction in wikipedia are you referring to?  A link to the
image might help.

2) is it possible, using mdadm, to check which "far" layout is in use?

mdadm --detail /dev/mdWHATEVER | grep Layout

I cannot answer that. Neil Brown should know.

Best regards
Keld
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hi all,
anyone with an update on these two questions?

I was thinking of using the kernel block trace facility to track disk
accesses and infer the on-disk data structure, but I haven't tried it yet.

On the other hand, I carefully looked at the mdadm output, without finding
anything related to physical block placement.

Look for "Layout".

NeilBrown

Any new advice in that regard?
Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8