Re: RAID50, despite chunk setting, does everything in 4KB blocks

On Mon, 19 Dec 2011 15:43:13 -0700 Chris Worley <worleys@xxxxxxxxx> wrote:

> It doesn't really matter what chunk sizes I set, but, for example, I
> create three RAID5's of 5 drives each with a chunk size of 32K, and
> create a RAID0 comprised of the three RAID5's with a chunk size of
> 64K:
> 
> md0 : active raid0 md27[2] md26[1] md25[0]
>       1885098048 blocks super 1.2 64k chunks
> 
> If I write to one of the RAID5's, using:
> 
> # dd of=/dev/md27  if=/dev/zero bs=1024k oflag=direct
> 
> ... then "iostat -dmx 2" shows the drives being written to in 32K
> chunks (avgrq-sz=64), as you'd expect.
> 
> But, writing to the RAID0 that's striping the RAID5's, shows
> everything being written in 4KB chunks (iostat shows avgrq-sz=8) to
> the RAID0 as well as to the RAID5's.

When writing to a RAID5, the md code *always* submits requests to the lower
layers in PAGE-sized units.  This makes it much easier to keep parity and data
aligned.

The queue on the underlying device should sort the requests and group them
together, and your evidence (32K requests reaching the drives) suggests that
it does.
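
For reference, a small sketch of how the iostat figures quoted above map to
sizes. It assumes avgrq-sz is reported in 512-byte sectors and that the page
size is 4 KiB (typical on x86); the function name is only illustrative.

```python
SECTOR_BYTES = 512      # assumption: iostat reports avgrq-sz in 512-byte sectors
PAGE_BYTES = 4096       # assumption: 4 KiB pages, as on x86


def sectors_to_kib(avgrq_sz):
    """Convert an avgrq-sz value (in sectors) to KiB."""
    return avgrq_sz * SECTOR_BYTES / 1024


for label, avgrq_sz in (("direct to RAID5", 64), ("through RAID0", 8)):
    size_bytes = avgrq_sz * SECTOR_BYTES
    pages = size_bytes // PAGE_BYTES
    print(f"{label}: avgrq-sz={avgrq_sz} -> {sectors_to_kib(avgrq_sz):.0f} KiB "
          f"({pages} page(s) per request)")
```

So avgrq-sz=64 means the page-sized requests were merged into 32 KiB requests
(8 pages each), while avgrq-sz=8 means they reached the device one page at a
time, unmerged.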

When writing to the RAID5 through a RAID0 it will only see 64K at a time, but
that shouldn't make any difference to its behaviour and shouldn't change
the way the requests finally get to the device.
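
To illustrate "only see 64K at a time", here is a simplified model of how a
1024k write to the RAID0 would be cut at 64K chunk boundaries and spread over
the three RAID5 members. This is a sketch of plain round-robin striping over
equal-sized members, not the actual md raid0 code; the function and constant
names are hypothetical.

```python
CHUNK = 64 * 1024   # RAID0 chunk size from the report
MEMBERS = 3         # md25, md26, md27


def split_raid0_write(offset, length, chunk=CHUNK, members=MEMBERS):
    """Return (member, member_offset, size) pieces, cut at chunk boundaries."""
    pieces = []
    end = offset + length
    while offset < end:
        chunk_index = offset // chunk
        within = offset % chunk
        # A piece never crosses a chunk boundary.
        size = min(chunk - within, end - offset)
        member = chunk_index % members
        member_offset = (chunk_index // members) * chunk + within
        pieces.append((member, member_offset, size))
        offset += size
    return pieces


pieces = split_raid0_write(0, 1024 * 1024)
for member, member_offset, size in pieces[:4]:
    print(f"member {member}: offset {member_offset}, {size // 1024}K")
print(len(pieces), "pieces in total")
```

In this model each RAID5 receives 64K pieces rather than the full 1024k
request, and each piece is then handled in page-sized units as described above.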

So I have no idea why you see a difference.

I suspect lots of block-layer tracing, lots of staring at code, and lots
of head scratching would be needed to understand what is really going on.


> 
> Why is that?  Note that this is true for reading too.  Note I don't
> see the same problem when using RAID10 (via striped RAID1's) or
> RAID100 (via striped RAID10's).

RAID1 and RAID10 don't split things into pages, so I can imagine that they
might make life easier for the scheduler.

But the scheduler should still get it right for RAID5 ....


So - it's a mystery.  Sorry.

NeilBrown


> 
> ... this is on SLES11 using a 2.6.32.43-0.5 kernel.
> 
> Thanks,
> 
> Chris
