hey bernd, long time no chat. it turns out you don't have to know what swift is because I've been able to demonstrate this behavior with a very simple python script that just creates files in a 3-tier hierarchy. the third-level directories each contain a single file, which for my testing is always 1K.
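in case it's useful, the gist of that script is roughly this (just a
sketch -- the target path, directory counts and file count below are
illustrative placeholders, not my exact test parameters):

#!/usr/bin/env python
# rough sketch of the load generator: 3 tiers of directories, each
# third-level directory holding a single 1K file
import os

TOP = "/srv/node/disk0/bench"   # illustrative target path
DIRS_PER_TIER = 1000            # directories per tier (placeholder)
FILES = 100000                  # number of 1K files to create (placeholder)
DATA = b"x" * 1024              # 1K payload

for i in range(FILES):
    # spread files across the first two tiers; the third tier gets a
    # unique directory per file so each leaf holds exactly one file
    d1 = i % DIRS_PER_TIER
    d2 = (i // DIRS_PER_TIER) % DIRS_PER_TIER
    leaf = os.path.join(TOP, str(d1), str(d2), str(i))
    os.makedirs(leaf, exist_ok=True)
    with open(os.path.join(leaf, "obj"), "wb") as f:
        f.write(DATA)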
I have played with vfs_cache_pressure and it doesn't seem to make a difference, though that was a while ago and perhaps it is worth revisiting. one thing you may get a hoot out of, being a collectl user, is that I have an xfs plugin that lets you look at a ton of xfs stats, either in realtime or after the fact, just like any other collectl stat. I just haven't added it to the kit yet.
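when I do revisit the cache_pressure tuning I'll probably just poke it
through /proc, something along these lines (the value of 50 is only an
example, not a recommendation):

#!/usr/bin/env python
# read the current vfs_cache_pressure and lower it; needs root to write.
# the default is 100; lower values make the kernel hold on to
# dentries/inodes longer before reclaiming them for page cache.
KNOB = "/proc/sys/vm/vfs_cache_pressure"

with open(KNOB) as f:
    print("current:", f.read().strip())

with open(KNOB, "w") as f:
    f.write("50\n")   # example value only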
-mark
On Mon, Jan 25, 2016 at 1:24 PM, Bernd Schubert <bschubert@xxxxxxx> wrote:
Hi Mark!
On 01/06/2016 04:15 PM, Mark Seger wrote:
> I've recently found that the performance of our development swift system is
> degrading over time as the number of objects/files increases. This is a
> relatively small system, each server has 3 400GB disks. The system I'm
> currently looking at has about 70GB tied up in slabs alone, close to 55GB
> in xfs inodes and ili, and about 2GB free. The kernel
> is 3.14.57-1-amd64-hlinux.
>
> Here's the way the filesystems are mounted:
>
> /dev/sdb1 on /srv/node/disk0 type xfs
> (rw,noatime,nodiratime,attr2,nobarrier,inode64,logbufs=8,logbsize=256k,sunit=512,swidth=1536,noquota)
>
> I can do about 2000 1K file creates/sec when running 2 minute PUT tests at
> 100 threads. If I repeat those tests for multiple hours, I see the number
> of IOPS steadily decreasing to about 770 and the very next run it drops to
> 260 and continues to fall from there. This happens at about 12M files.
>
> The directory structure is 2 tiered, with 1000 directories per tier so we
> can have about 1M of them, though they don't currently all exist.
This sounds pretty much like the hash directories used by some parallel
file systems (Lustre, and in the past BeeGFS). For us the file-create
slowdown was due to the lookup in those directories to check whether a
file with the same name already exists. At least for ext4 it was rather
easy to demonstrate that simply caching directory blocks would eliminate
that issue.
We then considered working on a better kernel cache, but in the end we
simply found a way to get rid of that simple directory structure in
BeeGFS and changed it to a more complex layout with less random access,
which eliminated the main reason for the slowdown.
Now I have no idea what a "swift system" is, in which order it creates
and accesses those files, or whether it would be possible to change the
access pattern. One thing you might try, and which should work much
better since 3.11, is the vfs_cache_pressure setting. The lower it is,
the fewer dentries/inodes are dropped from cache when pages are needed
for file data.
Cheers,
Bernd