Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx
607-760-2328 (Cell)
607-777-4641 (Office)
On 02/23/2015 06:18 AM, Emmanuel Florac wrote:
> On Sun, 22 Feb 2015 18:35:19 -0500
> Dave Hall <kdhall@xxxxxxxxxxxxxx> wrote:
>> So since I have a fresh array that's not in production yet I was
>> hoping to get some pointers on how to configure it to maximize XFS
>> performance. In particular, I've seen a suggestion that a
>> multipathed array should be sliced up into logical drives and pasted
>> back together with LVM. Wondering also about putting the journal in
>> a separate logical drive on the same array.
> What's the hardware configuration like? Before multipathing, you need
> to know if your RAID controller and disks can actually saturate your
> link. Generally SAS-attached enclosures are driven through a 4-way
> SFF-8088 cable, with a bandwidth of 4x 6Gbps (maximum throughput per
> link: 3 GB/s) or 4x 12Gbps (maximum throughput: 6 GB/s).
The new hardware is an Infortrend with 16 x 2TB 6Gbps SAS drives. It
has one controller with dual 6Gbps SAS ports. The server currently has
two 3Gbps SAS HBAs.
On an existing array based on similar but slightly slower hardware, I'm
getting miserable performance. The bottleneck seems to be on the server
side. For specifics, the array is laid out as a single 26TB volume and
attached via a single 3Gbps SAS link. The server is a quad-socket 8-core
Xeon with 128GB RAM, and the networking is all 10GbE. The application is rsnapshot,
which is essentially a series of rsync copies where the unchanged files
are hard-linked from one snapshot to the next. CPU utilization is very
low and only a few cores seem to be active. Yet the operation is taking
hours to complete.
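
Something like the following, run during a backup window, should show
whether the existing array is actually saturated or whether the single
rsync stream is the limit (the device name here is just a placeholder
for the array's SAS device):

    # extended per-device stats every 5 seconds (sysstat package)
    iostat -xm 5 /dev/sdb

    # rough reading: %util near 100 / high await  -> the array is the limit
    #                %util low while the run crawls -> the single rsync
    #                stream / hard-link metadata work is the limit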
The premise that was presented to me by someone in the storage business
is that with 'many' processor cores one should slice a large array up
into segments, multipath the whole deal, and then mash the segments back
together with LVM (or MD). Since the kernel would ultimately see a
bunch of smaller storage segments that were all getting activity, it
should dispatch a set of cores for each storage segment and get the job
done faster. I think in theory this would even work to some extent on a
single-path SAS connection.
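
As a sketch of what that would look like (completely untested; the mpath
names, LUN count, and chunk size below are placeholders, assuming the
Infortrend is carved into four equal LUNs that multipathd already sees):

    # check that every LUN shows up with both paths
    multipath -ll

    # stripe the multipath devices back together with md RAID-0
    # (256 = 256KiB chunk; pick it to match the hardware stripe)
    mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=256 \
        /dev/mapper/mpatha /dev/mapper/mpathb \
        /dev/mapper/mpathc /dev/mapper/mpathd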
>> I am able to set up a 2-way multipath right now, and I might be able
>> to justify adding a second controller to the array to get a 4-way
>> multipath going.
> A multipath can double the throughput, provided that you have enough
> drives: you'll need about 24 7k RPM drives to saturate _one_ 4x6Gbps
> SAS link. If you have only 12 drives, dual attachment probably won't
> yield much.
>> Even if the LVM approach is the wrong one, I clearly have a rare
>> chance to set this array up the right way. Please let me know if you
>> have any suggestions.
> In my experience, software RAID-0 with md gives slightly better
> performance than LVM, though not much.
MD RAID-0 seems as likely as LVM, so I'd probably try that first. The
big question is how to size the slices of the array to make XFS happy
and then how to make sure XFS knows about it. Secondly, there is the
question of the log volume. It seems that with multipath there might be
some advantage to putting the log in its own slice on the array, so
that log writes go through an I/O stream that is managed separately
from the rest.
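
To make the question concrete, something along these lines is what I'm
imagining; all of the numbers and device names are made up and would
need to match the real array geometry (I believe mkfs.xfs picks up the
stripe geometry from md automatically, but with hardware RAID slices
underneath it probably has to be given explicitly):

    # hypothetical layout: 4 slices striped by md, each slice built on a
    # 128k hardware stripe unit -- adjust su/sw to the real geometry
    mkfs.xfs -d su=128k,sw=4 \
             -l logdev=/dev/mapper/mpath-log,size=128m \
             /dev/md0

    # an external log also has to be named at mount time
    mount -o logdev=/dev/mapper/mpath-log,inode64 /dev/md0 /backup

If I remember right the XFS log tops out around 2GB, so the log slice
could be tiny.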