Hi everyone,
I've been running our two-node mini-cluster for some months now. An
OSD, an MDS and a monitor run on both nodes. Additionally, there is a
very small third node which only runs a third monitor, but no MDS/OSD.
On both main servers, CephFS is mounted via fstab using the kernel
driver. The mounted folder is /var/www, which hosts many websites. We
use Ceph in this setup to achieve redundancy, so we can easily switch
over to the other node in case one of them fails. The kernel version is
4.9.6. For the most part it's running great and the performance of the
filesystem is very good. Only a few stubborn problems/questions have
remained over the whole time, and I'd like to settle them once and for
all.
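For context, the CephFS line in our /etc/fstab looks roughly like the
following sketch (the monitor IPs, the client name and the secretfile
path are placeholders here, not our exact values):

    10.0.0.1:6789,10.0.0.2:6789,10.0.0.3:6789:/  /var/www  ceph  name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev  0 0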
1) Every once in a while, some processes (PHP) accessing the filesystem
get stuck in D-state (uninterruptible sleep). I wonder whether this
happens due to network fluctuations (both servers are connected via a
simple Gigabit crosslink cable) and how to diagnose it. Why exactly
does this happen in the first place? And what is the proper way to get
these processes out of this state? Why doesn't a timeout or anything
similar kick in? I've read about client eviction, but when I run "ceph
daemon mds.node1 session ls" I only see two entries - one per server.
Obviously I don't want to evict all processes on a server, only the
stuck one. So far, the only method I have found to get rid of such a
D-state process is a reboot, which is of course not a great solution.
When I tried to restart only the MDS service instead of rebooting, many
more processes got stuck and the load went above 500 (most probably not
CPU load, but processes waiting for I/O).
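In case it helps, this is roughly what I plan to look at the next time
it happens; the pid and session id are placeholders, and I'm not sure
the evict syntax below is the right one for my release, so please
correct me:

    # where in the kernel is the stuck process sleeping?
    cat /proc/<pid>/stack
    # any hung-task or libceph/ceph messages?
    dmesg | grep -iE 'ceph|hung'
    # in-flight MDS/OSD requests of the kernel client (needs debugfs mounted)
    cat /sys/kernel/debug/ceph/*/mdsc
    cat /sys/kernel/debug/ceph/*/osdc
    # evict a whole client session (not a single process) by the id from "session ls"
    ceph daemon mds.node1 session evict <session_id>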
I found this thread here:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001513.html
Is this (still) relevant to my problem? I also read somewhere that you
should not mount CephFS on the same server that runs the MDS unless you
have a "newer" kernel (I can't find where I read this). That
information was a bit older, though, so I wonder whether 4.9.6 isn't
sufficient or whether this is still a problem at all...
2) A second, also still unsolved problem: most of the time "ceph
health" shows something like "Client node2 failing to respond to cache
pressure". Restarting the MDS removes this message for a while before
it appears again. I could make the message go away by setting "mds
cache size" higher than the total number of files/folders on the whole
filesystem, which is obviously not a scalable solution. The message
doesn't seem to cause any problems, though. Nevertheless, I'd like to
solve this. By the way: when I run "session ls" I see a very high
number of caps held (num_caps around 80000). Doesn't this mean that
that many files are open/occupied by one or more processes? Is this
normal? I have some cronjobs which run find or chmod over the
filesystem from time to time. Could they be responsible for this? Is
there some setting to make Ceph release those "caps" faster/earlier?
Thank you / BR
Ranjan