Hi everyone,
I've been running our two-node mini-cluster for some months now. An
OSD, an MDS and a monitor run on both nodes. Additionally, there is a
very small third node which only runs a third monitor, but no MDS/OSD.
On both main servers, CephFS is mounted via fstab using the kernel
driver. The mounted folder is /var/www, which hosts many websites. We
use Ceph in this setup to achieve redundancy, so we can easily switch
over to the other node in case one of them fails. The kernel version is
4.9.6. For the most part it's running great and the performance of the
filesystem is very good. Only a few stubborn problems/questions have
remained over the whole time, and I'd like to settle them once and for
all.
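For context, the CephFS line in our /etc/fstab looks roughly like the
following sketch (the monitor IPs, the client name and the secretfile
path are placeholders here, not our exact values):

    10.0.0.1:6789,10.0.0.2:6789,10.0.0.3:6789:/  /var/www  ceph  name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev  0 0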
1) Every once in a while, some processes (PHP) accessing the filesystem
get stuck in D-state (uninterruptible sleep). I wonder whether this
happens due to network fluctuations (both servers are connected via a
simple Gigabit crosslink cable) and how to diagnose it. Why exactly
does this happen in the first place? And what is the proper way to get
these processes out of this state? Why doesn't a timeout or anything
similar kick in? I've read about client eviction, but when I run "ceph
daemon mds.node1 session ls" I only see two entries - one per server.
Obviously I don't want to evict all processes on a server, only the
stuck one. So far, the only method I have found to get rid of such a
D-state process is a reboot, which is of course not a great solution.
When I tried to restart only the MDS service instead of rebooting, many
more processes got stuck and the load went above 500 (most probably not
CPU load, but processes waiting for I/O).
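In case it helps, this is roughly what I plan to look at the next time
it happens; the pid and session id are placeholders, and I'm not sure
the evict syntax below is the right one for my release, so please
correct me:

    # where in the kernel is the stuck process sleeping?
    cat /proc/<pid>/stack
    # any hung-task or libceph/ceph messages?
    dmesg | grep -iE 'ceph|hung'
    # in-flight MDS/OSD requests of the kernel client (needs debugfs mounted)
    cat /sys/kernel/debug/ceph/*/mdsc
    cat /sys/kernel/debug/ceph/*/osdc
    # evict a whole client session (not a single process) by the id from "session ls"
    ceph daemon mds.node1 session evict <session_id>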
I found this thread here:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001513.html
Is this (still) relevant to my problem? I also read somewhere that you
should not mount CephFS on the same server that runs the MDS unless you
have a "newer" kernel (I can't find where I read this). That
information was a bit older, though, so I wonder whether 4.9.6 isn't
sufficient or whether this is still a problem at all...
2) A second, also still unsolved problem: most of the time "ceph
health" shows something like "Client node2 failing to respond to cache
pressure". Restarting the MDS removes this message for a while before
it appears again. I could make the message go away by setting "mds
cache size" higher than the total number of files/folders on the whole
filesystem, which is obviously not a scalable solution. The message
doesn't seem to cause any problems, though. Nevertheless, I'd like to
solve this. By the way: when I run "session ls" I see a very high
number of caps held (num_caps around 80000). Doesn't this mean that
that many files are open/occupied by one or more processes? Is this
normal? I have some cronjobs which run find or chmod over the
filesystem from time to time. Could they be responsible for this? Is
there some setting to make Ceph release those "caps" faster/earlier?
Thank you / BR
Ranjan