Re: Unsolved questions

Konstantin Shalygin <k0ste@xxxxxxxx> · Tue, 7 Feb 2017 10:38:13 +0700



      1) Every once in a while, some processes (PHP) accessing the filesystem 
get stuck in a D-state (uninterruptible sleep). I wonder if this happens 
due to network fluctuations (both servers are connected via a simple 
Gigabit crosslink cable) or how to diagnose this. Why exactly does this 
happen in the first place? And what is the proper way to get these 
processes out of this situation? Why doesn't a timeout happen or anything 
else? I've read about client eviction, but when I enter "ceph daemon 
mds.node1 session ls" I only see two "entries" - one for each server. 
But I don't want to evict all processes on the server, obviously. Only 
the stuck process. So far, the only method I found to remove the D 
process is to reboot. Which is of course not a great solution. When I 
tried to only restart the MDS service instead of rebooting, many more 
processes got stuck and the load was >500 (not CPU most probably but due 
to processes waiting for I/O).
    
    Because of PHP. We have many PHP scripts (parsers) on a VM, run by
    cron.

    This machine dies; the only question is when. We usually reboot it
    once the load average reaches about 200-350.

    I think this happens because some main PHP process has died: ioctl()
    never returns an answer to the child process, so the child stays in
    D state.
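
    On the "how to diagnose this" part: a rough sketch for listing which
    processes are in D state and what they are waiting on. It assumes a
    Linux /proc and only reads /proc/<pid>/stat and /proc/<pid>/wchan.
    The function names are my own, nothing here comes from Ceph, and
    wchan may be empty or "0" depending on kernel and privileges.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx])
    comm = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    """Read a small proc file, returning '' if it vanished or is unreadable."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return ""


def find_d_state(proc_root="/proc"):
    """List (pid, comm, wchan) for every process in uninterruptible sleep."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        stat = read_text(os.path.join(proc_root, entry, "stat"))
        if not stat:
            continue  # process exited while we were scanning
        pid, comm, state = parse_stat(stat)
        if state == "D":
            wchan = read_text(os.path.join(proc_root, entry, "wchan"))
            stuck.append((pid, comm, wchan))
    return sorted(stuck)


if __name__ == "__main__":
    for pid, comm, wchan in find_d_state():
        print(f"{pid}\t{comm}\t{wchan or '?'}")
```

    Run from cron every minute or so, this would at least show whether
    the stuck PHP processes are all waiting in the same kernel function
    before the load average climbs.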
  
