Re: ag selection

Bernd Schubert <bernd.schubert@xxxxxxxxxxxxxxxxxx> · Mon, 11 Nov 2013 19:23:51 +0100

On 11/11/2013 06:55 PM, Carlos Maiolino wrote:
> On Mon, Nov 11, 2013 at 03:53:14PM -0200, Carlos Maiolino wrote:
>> On Mon, Nov 11, 2013 at 06:25:13PM +0100, Bernd Schubert wrote:
>>> Hi all,
>>>
>>> for streaming writes onto a raid6 the current round-robin ag
>>> selection does not seem to be optimal. Writing 4 files from 4
>>> threads into a single directory we get 900 MB/s, writing 4 files in
>>> 4 different directories we only get 700 MB/s (12 disks with hw
>>> megaraid-sas). The current round-robin scheme seems to be optimized
>>> for linear raid0? With small AGs one could also argue that choosing
>>> AGs which are not far away from each other (with respect to the number
>>> of blocks) also adds more parallel disk access for small and medium
>>> sized files.
>>>
>>> Any objections against a patch to improve the AG selection?
>>>
>>
>> I wouldn't say it is optimized specifically for raid 0 environments, but I
>> lack some knowledge on this choice. The main reason for the round-robin, IIRC,
>> was to avoid lock contention in a single AG, spreading different files across the
>> whole disk while still allowing them to be allocated contiguously on the disk.
>>
> Lock contention in the inode and block B-trees, for example. This improves parallelism
> in the filesystem, but of course this might not be the optimal behavior for all

Agreed, more locks help to avoid that.

> environments. That's why XFS has a long list of tuning mkfs/mount options :-)
> 
>> But I'm not sure what kind of optimization you have in mind, and I believe
>> other engineers will also need some extra information about it, such as
>> what kind of tests you're doing (Direct I/O, buffered,
>> pre-allocation), etc. You'll also need to post filesystem configuration details like
>> FS alignment (su, sw options), etc.

One of my colleagues benchmarked this on one of our fast systems, and another
colleague currently needs this system for other tests, so I don't have the
exact parameters. However, it was certainly formatted with options like these:

mkfs.xfs -d su=256k,sw=10 -l version=2,su=256k -isize=512 /dev/sdX

and mounted with these options:

mount -onoatime,nodiratime,largeio,inode64,swalloc,allocsize=131072k,nobarrier /dev/sdX <mountpoint>
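
To spell out the geometry these options imply, here is a small sketch of the arithmetic. It assumes the 12 disks form a single RAID6 array with 2 parity disks, which would match sw=10; the variable names are only for illustration.

```shell
#!/bin/sh
# Stripe geometry implied by su=256k,sw=10 and allocsize=131072k.
# Assumption: one 12-disk RAID6 array, i.e. 10 data disks + 2 parity.
su_kib=256            # stripe unit per data disk, in KiB
sw=10                 # number of data disks
allocsize_kib=131072  # allocsize mount option, in KiB

# Full stripe width = stripe unit * number of data disks.
stripe_width_kib=$((su_kib * sw))
allocsize_mib=$((allocsize_kib / 1024))

# How many full stripes fit into one allocsize chunk, and what is left over.
stripes_per_alloc=$((allocsize_kib / stripe_width_kib))
remainder_kib=$((allocsize_kib % stripe_width_kib))

echo "stripe width: ${stripe_width_kib} KiB"
echo "allocsize: ${allocsize_mib} MiB"
echo "full stripes per allocsize chunk: ${stripes_per_alloc}"
echo "remainder: ${remainder_kib} KiB"
```

Under these assumptions the allocsize is not a whole multiple of the full stripe width, which may or may not matter for the numbers above.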

>>
>> For different write patterns, you might also want to take a look at the
>> rotor_step procfs option, and some other options dedicated to streaming writes,
>> that might help you in this case.

Thanks, I didn't know that knob, I'm going to look into it.
According to the comments it's for inode32 only, but I need
to read the xfs_alloc code first to see what it actually
does.
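
For anyone else following along: as far as I can tell from the kernel's XFS documentation, the knob is exposed as the sysctl fs.xfs.rotorstep. A minimal sketch for checking the current value, assuming the usual /proc/sys layout (the path and the example value in the comment are assumptions, not something verified on the benchmark system):

```shell
#!/bin/sh
# Read the current rotorstep value, if the XFS sysctl is present.
# Assumption: the knob lives at /proc/sys/fs/xfs/rotorstep.
rotor_file=/proc/sys/fs/xfs/rotorstep

if [ -r "$rotor_file" ]; then
    rotorstep=$(cat "$rotor_file")
else
    rotorstep="unavailable"
fi

echo "fs.xfs.rotorstep = ${rotorstep}"

# Changing it requires root; the value here is purely illustrative:
#   sysctl -w fs.xfs.rotorstep=4
```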

Thanks,
Bernd

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs