Re: Speeding up chunk size change?

Steven Haigh <netwiz@xxxxxxxxx> · Sun, 04 Mar 2012 11:56:39 +1100

On 4/03/2012 8:42 AM, Stan Hoeppner wrote:
On 3/3/2012 1:36 PM, Steven Haigh wrote:
Hi all,

I just wanted to run this past a few folk here as I want to make sure
I'm doing it the Right Way(tm).

I've decided to experiment with using a 128KB chunk size on my RAID6
instead of a 64KB chunk.

Why?  Does your target application(s) perform better with a larger
chunk, and therefore larger total stripe size?  If you're strictly after
larger dd copy numbers then you're wasting everyone's time, including
yours, as such has almost zero bearing on real world performance, as
most workloads are far more random than sequential.

Purely experimental for fun and education. I actually thought that a
reshape would go at somewhat near the resync speeds I get of
~60-90MB/sec. I guess this shows I'm wrong ;)

And apparently you're not using XFS.  This reshape will screw up your
alignment, and you'll need to change your fstab mount to reflect the new
RAID geometry.  But my guess is you're not using it.  If you were you'd
probably be experienced enough to know that doubling your chunk size
isn't going to make much difference, if any, in real world system usage.
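
For reference, the geometry change being described can be worked out by hand. The sketch below is only an illustration: `xfs_geometry` is a hypothetical helper, and the disk count of 6 is an assumption (the script later in this message loops over sda to sdf, while the first blockdev glob also includes sdg). XFS takes the `sunit` and `swidth` mount options in 512-byte sectors, and RAID6 gives up two disks' worth of capacity to parity.

```shell
# Hypothetical helper: XFS sunit/swidth (in 512-byte sectors) for a RAID6 array.
# usage: xfs_geometry <chunk_kib> <total_disks>
xfs_geometry() {
    chunk_kib=$1
    total_disks=$2
    data_disks=$((total_disks - 2))   # RAID6: two disks' worth of parity
    sunit=$((chunk_kib * 2))          # 1 KiB = two 512-byte sectors
    swidth=$((sunit * data_disks))    # one full stripe of data chunks
    echo "sunit=$sunit,swidth=$swidth"
}

xfs_geometry 64 6     # old 64 KiB chunk (6 disks assumed)
xfs_geometry 128 6    # new 128 KiB chunk (6 disks assumed)
```

Whether these values matter for a filesystem that sits on an LV inside a guest depends on how that LV is aligned to the array, which this thread does not establish.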

I do use XFS - but this machine's role is a Xen Dom0 - so md2 holds the
filesystems for the guest VMs in LVs. One of those guest filesystems is
an LV of the VG on md2 formatted as XFS. It will be interesting to see
how this affects things :)

I set a few 'optimisations' that I believe should help:
## Tweak the RAIDs
blockdev --setra 8192 /dev/sd[abcdefg]

Read-ahead is per file descriptor, and occurs at the filesystem level.
The read-ahead value used is that of the device immediately underlying
the filesystem.  So don't bother setting these above.
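
One way to see which value applies is to read each layer's setting from sysfs. This is a minimal sketch: `show_readahead` is a hypothetical helper, and the device names passed to it are examples only. With LVM in the stack, the device directly under a filesystem would normally be a `dm-N` node rather than md2 itself.

```shell
# Print the read-ahead setting (in KiB) for each named block device.
# Devices that do not exist on this machine are reported, not treated as errors.
show_readahead() {
    for dev in "$@"; do
        f="/sys/block/$dev/queue/read_ahead_kb"
        if [ -r "$f" ]; then
            echo "$dev: $(cat "$f") KiB"
        else
            echo "$dev: not present"
        fi
    done
}

show_readahead md2 sda   # example names; substitute the devices in your stack
```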

Interesting - I didn't think that was the case for whole disk arrays - 
but there you go... Learnt something else :)

blockdev --setra 8192 /dev/md0
blockdev --setra 8192 /dev/md1
blockdev --setra 16384 /dev/md2

This is fine.  You could theoretically set this to 1GB or more if you
always read entire files, with no ill effects, as read-ahead doesn't go
past EOF.  However if you do any mmap reads (many apps do) of portions
of large files, this will hammer performance, obviously, as you're
reading entire large files speculatively when not needed.  Play with
this at your own risk.
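
To put the numbers above in more familiar units: `blockdev --setra` takes its argument in 512-byte sectors, while the sysfs `read_ahead_kb` file is in KiB. A small conversion sketch (`setra_to_kib` is a hypothetical helper name):

```shell
# Convert a blockdev --setra value (512-byte sectors) to KiB.
setra_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

setra_to_kib 8192     # value used for md0 and md1
setra_to_kib 16384    # value used for md2
```

So 8192 sectors is 4096 KiB (4 MiB) and 16384 sectors is 8192 KiB (8 MiB) of read-ahead.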

The workloads of the array (having LVM on top) for the VMs would 
probably make it quite random. This is part of the reason I am playing 
here - pure experimentation. I am very curious to see if it works better 
or worse after the reshape. I honestly don't know :)

echo 16384 > /sys/block/md2/md/stripe_cache_size
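
The stripe cache is not free. The kernel's md documentation gives its memory cost as page size times number of disks times `stripe_cache_size`. The sketch below assumes 4 KiB pages and 6 member disks, neither of which is confirmed in this thread, and `stripe_cache_mib` is a hypothetical helper.

```shell
# Estimate stripe cache memory in MiB.
# usage: stripe_cache_mib <stripe_cache_size> <nr_disks> [page_size_bytes]
stripe_cache_mib() {
    entries=$1
    disks=$2
    page=${3:-4096}   # assumption: 4 KiB pages
    echo $(( entries * disks * page / 1024 / 1024 ))
}

stripe_cache_mib 16384 6   # the value set above, with 6 disks assumed
```

Under those assumptions, a setting of 16384 pins roughly 384 MiB of RAM.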

for i in sda sdb sdc sdd sde sdf; do
         echo "Setting options for $i"
         echo 256 > /sys/block/$i/queue/nr_requests
         echo 4096 > /sys/block/$i/queue/read_ahead_kb
Eliminate this line ^^^^

Any insight into why? I would have thought that this would help - 
however I'm not quite sure as to the values - as this is much less than 
one chunk... That also being said, wouldn't it be a good idea to have 
*some* readahead?

         echo 1 > /sys/block/$i/device/queue_depth
         echo deadline > /sys/block/$i/queue/scheduler
done

Just wondering if anyone knows of any possible way to speed up the
reshape a little, or if (like I suspect) it will take ~2 days to
complete the reshape.
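
A rough way to reason about this: md throttles resync and reshape between the limits in /proc/sys/dev/raid/speed_limit_min and speed_limit_max (KiB/s per device), and progress is reported in /proc/mdstat. The sketch below only reads the limits and does the time arithmetic. `eta_hours` is a hypothetical helper and the figures passed to it are made-up round numbers, not measurements from this array. Raising the minimum only helps if the reshape is being throttled rather than being bound by seeks.

```shell
# Show the global md throttle limits, if this kernel exposes them.
for f in /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f") KiB/s"
    fi
done

# Whole hours needed to process <size_kib> at <speed_kib_per_s> (rounded down).
eta_hours() {
    echo $(( $1 / $2 / 3600 ))
}

eta_hours 1000000000 10000   # illustrative numbers only

# To raise the floor (needs root; the value is illustrative, not a recommendation):
# echo 50000 > /proc/sys/dev/raid/speed_limit_min
```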

Considering how expensive such operations are in both time and wear on
the disk drives, it's better to read everything available to you on the
subject and ask questions *before* performing expensive experiments on
your array.  If you currently have a performance problem you're trying
to solve, the cause lies somewhere other than your chunk size.

As I said above, there really is no 'problem' I'm trying to solve. The 
whole reason is experimentation and education - really to see a 'what 
if' case. The last reshape I did on this array was a RAID5->RAID6 grow 
which went very well - however I have never experimented with chunk size
on an mdadm RAID.

--
Steven Haigh

Email: netwiz@xxxxxxxxx
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299
