Re: Disappointing performance of copy (MD raid + XFS)

Eric Sandeen wrote:
Gabor Gombas wrote:
Kristleifur Daðason wrote:
[CUT]

Thank you guys for your help

I have done further investigation.

I still have not checked how performance holds up with very small files and multiple simultaneous rsyncs.

I have checked the other problem I mentioned, that I couldn't get above 150 MB/sec even with large files and multiple simultaneous transfers. I can confirm it, and I have narrowed the problem down: two XFS defaults (optimizations) actually hurt performance.

The first and most important is aligned writes: cat /proc/mounts lists this (autodetected) stripe geometry: "sunit=2048,swidth=28672". My chunks are 1MB and I have 16 disks in raid-6, so 14 data disks. Do you think it's correct? xfs_info lists blocks as 4k and gives sunit and swidth in 4k blocks, with very different values. Please do not use the same names "sunit"/"swidth" to mean two different things in two different places; it can confuse the user (me!)
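For reference, the arithmetic behind those numbers can be sketched as below. This is only a minimal sketch: it assumes the sunit/swidth values shown in /proc/mounts are in 512-byte sectors and the ones shown by xfs_info are in filesystem blocks, and the function name stripe_geometry is made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: /proc/mounts reports sunit/swidth in 512-byte sectors


def stripe_geometry(chunk_bytes, total_disks, parity_disks, fs_block_bytes=4096):
    """Expected sunit/swidth for an MD array, in sectors and in filesystem blocks."""
    data_disks = total_disks - parity_disks
    stripe_bytes = chunk_bytes * data_disks  # one full stripe of data
    return {
        "data_disks": data_disks,
        "sunit_sectors": chunk_bytes // SECTOR_BYTES,
        "swidth_sectors": stripe_bytes // SECTOR_BYTES,
        "sunit_blocks": chunk_bytes // fs_block_bytes,
        "swidth_blocks": stripe_bytes // fs_block_bytes,
    }


# 1 MiB chunks, 16 disks in raid-6 (2 parity), 4k filesystem blocks
geom = stripe_geometry(1024 * 1024, total_disks=16, parity_disks=2)
for name, value in geom.items():
    print(name, value)
```

Under these assumptions the sector figures come out as 2048 and 28672, matching /proc/mounts, and the same geometry expressed in 4k blocks is 256 and 3584.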

Anyway, that's not the problem: I have tried specifying other values in my mount (in particular the values sunit and swidth should have had if they were expressed in 4k blocks), but ANY xfs aligned mount kills performance for me. I have to specify "noalign" in my mount to go fast. (Also note this option cannot be changed with mount -o remount. I have to unmount.)

The other default feature that kills performance for me is the rotorstep. I have to max it out at 255 in order to get good performance. Actually it seems reasonable that a higher rotorstep should be faster... why is 1 the default? Why does it even exist? With low values the await (iostat -x 1) increases, I guess because of the seeks, and stripe_cache_active stays higher, because there are fewer filled stripes.
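To illustrate what I think rotorstep does, here is a toy model, not the actual XFS code: it assumes new files are placed round-robin across allocation groups and that rotorstep is the number of consecutive files placed in one group before moving to the next. The function name and the agcount value are hypothetical.

```python
def ag_for_file(file_index, agcount, rotorstep):
    """Toy model: allocation group chosen for the Nth newly created file."""
    # Assumption: the rotor advances by one per new file and wraps after
    # agcount * rotorstep files; the AG is the rotor divided by rotorstep.
    rotor = file_index % (agcount * rotorstep)
    return rotor // rotorstep


agcount = 4  # hypothetical number of allocation groups
for step in (1, 255):
    placement = [ag_for_file(i, agcount, step) for i in range(8)]
    print("rotorstep", step, "->", placement)
```

In this model, rotorstep=1 scatters eight consecutive files over all four groups, while rotorstep=255 keeps them in the same group, which would fit with the lower seek load I see.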

If I use noalign and rotorstep at 255 I am able to go at 325 MB/sec on average (16 parallel transfers of 7MB files) while with defaults I go at about 90 MB/sec.

Also, with noalign and rotorstep at 255, stripe_cache_active usually stays in the lower half (below 16000 out of 32000), while with defaults it's stuck at the maximum most of the time, and processes are stuck sleeping in MD locks for this reason.
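The stripe cache occupancy can be sampled with something like the sketch below. It assumes the array exposes stripe_cache_active and stripe_cache_size in its md sysfs directory; the device name md0 and the function name are placeholders.

```python
from pathlib import Path


def stripe_cache_fill(md_sysfs_dir):
    """Return (active, size, active/size) read from an md sysfs directory."""
    base = Path(md_sysfs_dir)
    active = int((base / "stripe_cache_active").read_text().strip())
    size = int((base / "stripe_cache_size").read_text().strip())
    return active, size, active / size


if __name__ == "__main__":
    md_dir = Path("/sys/block/md0/md")  # assumption: the array is md0
    if (md_dir / "stripe_cache_active").exists():
        active, size, ratio = stripe_cache_fill(md_dir)
        print(f"stripe cache: {active}/{size} ({ratio:.0%} in use)")
    else:
        print("no stripe cache files found under", md_dir)
```

Running it in a loop during a transfer shows whether the cache is pinned at its maximum or stays in the lower half.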

Do you know whether the sunit/swidth alignment mechanism is broken on 2.6.31, or more specifically on 2.6.31 Ubuntu generic-14?

(Kristleifur, thank you. I have seen your mention of the Ubuntu vs vanilla kernel; I will try a vanilla one, but right now I can't. However, I have now narrowed the problem down, so XFS people might want to look at the alignment problem more specifically.)

Regarding my previous post, I would still like to know what the stack traces I posted there mean: what are the functions xlog_state_get_iclog_space+0xed/0x2d0 [xfs] and
xfs_buf_lock+0x1e/0x60 [xfs]
and what are they waiting for?
These are still the places where processes get stuck, even after working around the alignment/rotorstep problem...

And then a few questions on inode64:
- If I start using inode64, do I have to remember to use inode64 on every subsequent mount for the life of that filesystem? Or does it record in some filesystem info region that the option has been used once, so that inode64 is applied automatically on subsequent mounts?
- If I use a 64-bit Linux distro, will ALL userland programs automatically support 64-bit inodes, or do I have to pay attention continuously and risk damaging my data?

Thanks for your help