Re: Using xfs_growfs on SSD raid-10

Hi Stan,

  Please see in-line:

On Thu, Jan 10, 2013 at 11:21 AM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
On 1/9/2013 7:23 PM, Alexey Zilber wrote:
> Hi All,
>
>   I've read the FAQ on 'How to calculate the correct sunit,swidth values
> for optimal performance' when setting up xfs on a RAID.  Thing is, I'm
> using LVM, and with the colo company we use, the easiest thing I've found,
> when adding more space, is to have another RAID added to the system, then
> I'll just pvcreate, expand the vgroup over it, lvextend and xfs_growfs and
> I'm done.  That is probably sub-optimal on an SSD raid.
>
> Here's the example situation.  I start off with a 6 (400GB) raid-10.  It's
> got 1M stripe sizes.  So I would start with pvcreate --dataalignment 1M
> /dev/sdb
> after all the lvm stuff I would do: mkfs.xfs -L mysql -d su=1m,sw=3
> /dev/mapper/db-mysql
> (so the above reflects the 3 active drives, and 1m stripe. So far so good?)
>
> Now, I need more space. We have a second raid-10 added, that's 4 (400gb)
> drives. So I do the same pvcreate --dataalignment 1M /dev/sdc
> then vgextend and lvextend, and finally; with xfs_growfs, there's no way to
> specify, change su/sw values.  So how do I do this?  I'd rather not use
> mount options, but is that the only way, and would that work?

It's now impossible to align the second array.  You have a couple of
options:


So it's only the sw=3 that's no longer valid, correct?  And there's no way to change it to sw=5?
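(What I had in mind was overriding the geometry at mount time with something roughly like the following -- /var/lib/mysql is just an assumed mount point, and sunit/swidth are given in 512-byte units, so 2048 = 1MB and 10240 = 5 * 2048:

    mount -o sunit=2048,swidth=10240 /dev/mapper/db-mysql /var/lib/mysql

I said I'd rather avoid mount options, but is that even a valid way to change sw after the fact?)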
 
1.  Mount with "noalign", but that only affects data, not journal writes

Is "noalign" the default when no sw/su option is specified at all?
 

2.  Forget LVM and use the old tried and true UNIX method of expansion:
 create a new XFS on the new array and simply mount it at a suitable
place in the tree.

Not a possible solution.  The space is for a database and must be contiguous.

 
3.  Add 2 SSDs to the new array and rebuild it as a 6 drive RAID10 to
match the current array.  This would be the obvious and preferred path,

How is this the obvious and preferred path when I still can't modify the sw value?  Same problem.  Data loss or reformatting is not the preferred path; it defeats the purpose of using LVM.  Also, the potential for data loss when enlarging the RAID array is huge.

 
assuming you actually mean 1MB STRIP above, not 1MB stripe.  If you

The stripe size is 1MB.
 
actually mean 1MB hardware RAID stripe, then the controller would have
most likely made a 768KB stripe with 256KB strip, as 1MB isn't divisible
by 3.  Thus you've told LVM to ship 1MB writes to a device expecting
256KB writes.  In that case you've already horribly misaligned your LVM
volumes to the hardware stripe, and everything is FUBAR already.  You
probably want to verify all of your strip/stripe configuration before
moving forward.

I don't believe you're correct here.  The SSD erase block size for the drives we're using is 1MB.  Why does being divisible by 3 matter?  Because of the number of drives?  Nowhere online have I seen anything about a 768KB stripe with a 256KB strip.  All the performance info I've seen points to a 1MB stripe being the fastest, and I'm sure that wouldn't be the case if the controller had to deal with two stripes.
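For what it's worth, I'll re-check what XFS itself recorded at mkfs time with something like this (assuming /var/lib/mysql is the mount point):

    xfs_info /var/lib/mysql

and compare the sunit/swidth it reports against the controller's strip/stripe settings.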


So essentially, my take-away here is that xfs_growfs doesn't work properly when adding more logical RAID drives?  What kind of a performance hit am I looking at if sw is wrong?  How about this: if I know that the maximum number of drives I can add to a RAID-10 is, say, 20, can I format with sw=10 (even though sw should be 3) in the eventual expectation of expanding it?  What would be the downside of doing that?  For example (see the sketch below), formatting up front with an sw that's wrong today but closer to correct after future expansions.
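In other words, something like this at initial format time (same label and device as before):

    mkfs.xfs -L mysql -d su=1m,sw=10 /dev/mapper/db-mysql

even though only 3 data disks exist today, purely in anticipation of later expansion.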

Thanks!
 

--
Stan



_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
