Various people have noticed performance problems and sporadic kernel log messages like

    kernel: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250)

with their Ceph clusters. We have seen this in one of our clusters ourselves, but had not been able to reproduce it in a lab environment until recently.

While trying to set up a benchmark to compare the effect of varying the bucket shard count, I suddenly started seeing the same issues again, and they seemed to be reproducible even with the latest upstream kernel. The test setup comprised 8 nodes with 2 SSDs as OSDs each. The messages started to appear after writing 16 KB sized objects with cosbench using 32 workers for about 2 hours, and soon after that the OSDs started dying because of the suicide timeout.

So we went ahead and tried running a kernel patched with [1], but this had only partial success, so I posted these results to the XFS mailing list. The response by Dave Chinner led to an important finding: creating the file system with the option "-n size=64k" was the culprit. Repeating the tests with directory block sizes <= 16k did not show any issues, and for this particular test the performance even turned out to be better when simply letting the directory block size stay at the default value of 4k.

In case you are seeing similar issues, you may want to check the directory block size of your file system; you can use xfs_info for that. The bad news is that this parameter cannot be changed on an existing file system, so you will need to reformat everything.

And the moral is: do not blindly trust configuration settings to be helpful, even if their use seems to be widespread and looks reasonable at first.

[1] http://oss.sgi.com/pipermail/xfs/2016-January/046308.html
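For reference, here is a rough sketch of what checking and fixing this looks like. The mount point, device name, and xfs_info output below are only illustrative and not taken from the actual cluster; the value to look at is bsize in the "naming" section.

    # check the directory block size of a mounted OSD file system
    $ xfs_info /var/lib/ceph/osd/ceph-0 | grep naming
    naming   =version 2              bsize=65536  ascii-ci=0 ftype=1

    # bsize=65536 means the file system was created with "-n size=64k";
    # recreating it without the -n option falls back to the 4k default
    # (WARNING: this destroys the existing file system)
    $ mkfs.xfs -f /dev/sdb1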