Re: RAID5 alignment issues with 4K/AF drives (WD green ones)

Hi all, thanks for the tips. I'll reply to everyone in one aggregated message:
> Just a thought, but do you have the "XP mode" jumper removed on all drives?
Yes.

> Instead of doing a monster sequential write to find my disk speed, I
> generally find it more useful to add conv=fdatasync to a dd so that
> the dirty buffers are utilized as they are in most real-world working
> environments, but I don't get a result until the test is on-disk.
Done; same results (40 MB/s).
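For the record, the write test was shaped roughly like this (device path and sizes are illustrative):

  # sequential write; dd only reports after the data is actually on disk
  dd if=/dev/zero of=/dev/md0 bs=1M count=4096 conv=fdatasync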

>>> My only suggestion would be to experiment with various partitioning,
>> 
>> 
>> Poster already said they're not partitioned.
> 
> Correct. Using partitioning allows you to adjust the alignment, so for
> example if the MD superblock at the front moves the start of the
> exported MD device out of alignment with the base disks, you could
> compensate for it by starting your partition on the correct offset.
Done. I've created one big partition using parted with "-a optimal".
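Roughly like this (device name as in my setup, exact invocation from memory):

  parted -a optimal /dev/sdc mklabel msdos
  parted -a optimal /dev/sdc mkpart primary 2048s 100%   # 2048s = 1 MiB, 4K-aligned
  parted /dev/sdc set 1 raid on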
The partition layout is (fdisk-friendly output):
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00077f06

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048  3907028991  1953513472   fd  Linux raid autodetect
I've redone the test with the "conv=fdatasync" option as above: same results.

> My only suggestion would be to experiment with various partitioning,
> starting the first partition at 2048s or various points to see if you
> can find a placement that aligns the partitions properly. I'm sure
> there's an explanation, but I'm not in the mood to put on my thinking
> hat to figure it out at the moment. May also be worth using a
> different superblock version, as 1.2 is 4k from the start of the
> drives, which might be messing with alignment (although I would expect
> it on all arrays), worth trying the .9 which goes to the end of the
> device.
I've tried all the superblock versions (0.90, 1.0, 1.1 and 1.2): same results.
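Each array was created along these lines, varying only --metadata between runs (device names illustrative):

  mdadm --create /dev/md0 --level=5 --raid-devices=3 \
        --metadata=1.2 /dev/sda1 /dev/sdb1 /dev/sdc1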

> No, those drives generally DON'T report 4k to the OS, even though they
> are. If they were, there'd be fewer problems. They lie and say 512b
> sectors for compatibility.
Yes, they are dirty liars. It's the same for the EADS series, not only the EARS ones.

> My recommendation would be to look into the stripe-cache settings and check
> iostat -x 5 output. What is most likely happening is that when writing to
> the raid5, it's reading some (to calculate parity most likely) and not just
> writing. iostat will confirm if this is indeed the case.
Could you explain how to inspect the stripe-cache settings?
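The only knob I've found so far is the raid5 stripe cache in sysfs; is that the one you mean? (md0 illustrative)

  cat /sys/block/md0/md/stripe_cache_size        # default 256 entries
  echo 8192 > /sys/block/md0/md/stripe_cache_size  # memory used = entries * 4K * nr_disks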
This is one of many similar iostat -x 5 outputs from the initial rebuild phase:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00   13.29    0.00    0.00   86.71
Device: rrqm/s  wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda    6585.60    0.00 4439.20    0.00 44099.20     0.00    19.87     6.14  1.38    1.38    0.00  0.09 39.28
sdb    6280.40    0.00 4746.60    0.00 44108.00     0.00    18.59     5.20  1.10    1.10    0.00  0.07 35.04
sdc       0.00 9895.40    0.00 1120.80     0.00 44152.80    78.79    12.03 10.73    0.00   10.73  0.82 92.32
I also built a RAID6 (with one drive missing): same results.
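The degraded create was along these lines (device names illustrative):

  # "missing" leaves one slot empty, so the array starts degraded
  mdadm --create /dev/md0 --level=6 --raid-devices=4 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 missing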

> There must be some misalignment somewhere :(
Yes, it certainly behaves as if something were misaligned.

> Do all drives really report as 4K to the OS - physical_block_size, logical_block_size under
> /sys/block/sdX/queue/ ??
No, they lie about the block size, as you can also see in the fdisk output above.
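For completeness, checking one drive directly (sdc as the example; the others report the same):

  cat /sys/block/sdc/queue/logical_block_size    # 512
  cat /sys/block/sdc/queue/physical_block_size   # 512, even though the platters are 4K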

> NB: how does it perform with partitions starting at sector 2048 (check
> all disks with fdisk -lu /dev/sdX).
They perform the same.

Any other suggestions?

I almost forgot: I've also booted OpenSolaris and created a ZFS pool (aligned to the 4K sectors) from the same three drives, and they perform very well, both individually and together. I know I'm comparing apples and oranges, but... there must be a solution!
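For the curious, the pool was a plain three-disk raidz, created along these lines (device names illustrative; the 4K alignment means an ashift of 12 in the vdev labels, which stock zpool doesn't expose as an option):

  zpool create tank raidz c8t1d0 c8t2d0 c8t3d0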