When those processes become blocked are the drives busy or idle?
Can you post the output of "ps -awexo pid,tt,user,fname,tmout,f,wchan" for those processes when that happens?
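A simpler variant that should show just the blocked ones (a rough sketch using standard Linux ps fields, adjust to taste):

    # list only processes in uninterruptible sleep (D state) together with their kernel wait channel
    ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'

The wchan column is usually the interesting part - it tells you which kernel function the process is sleeping in.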
My guess would be that they really are waiting for the disk array for some reason - can you check whether you can read from and write to the OSD partitions when this happens? What does iostat show? I'm not sure what your HBA is, but sometimes very bad things happen when you saturate the drives and cache completely, like the LUN queue depth dropping to 1 (which completely kills IO) or even commands being dropped (though that would likely show up in dmesg). This situation is often completely masked by the driver. And Ceph is really good at saturating drives.
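Something along these lines, run while the OSDs are stuck, should tell you whether the array is still answering at all (a sketch - the device names are placeholders, substitute your OSD LUNs):

    iostat -xm 2 /dev/sdb /dev/sdc      # ~100% util with almost no IOPS usually means the array/HBA is wedged
    dd if=/dev/sdb of=/dev/null bs=1M count=100 iflag=direct    # does a direct read even complete?
    dmesg | tail -50                    # any SCSI aborts, resets or queue-full messages?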
Also, how much memory does your machine have? vm.min_free_kbytes = 2640322 looks pretty high to me, and it could block anything at any time if kswapd kicks in and starts reclaiming pages. (But then the whole system would be unusable, so you'd likely notice that.)
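A quick way to sanity-check that (a sketch, assuming the sysstat package is installed for sar):

    grep MemTotal /proc/meminfo
    cat /proc/sys/vm/min_free_kbytes    # 2640322 means roughly 2.5 GB is kept free at all times
    sar -B 2 5                          # pgscank/s and pgsteal/s show whether kswapd is busy reclaiming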
Jan
Hello all,
When I try to add more than one OSD to a host and the backfilling process starts, all the OSD daemons except one become stuck in D state. When this happens they are shown as out and down (when running ceph osd tree).
The only way I can kill the processes is to remove the OSDs from the crushmap, then run kill -9 on them and wait a couple of minutes. There are no exception messages in the OSD logs and dmesg looks fine too (nothing out of the ordinary). I run Ceph Firefly 0.80.10 on Ubuntu 14.04 (Linux 3.13). The OSDs are running on RAID0 LUNs (2 drives per disk group) created on a Dell MD3000 array with Hitachi hard drives (450 GB, 15K RPM). The issue happens even with only 2 or 3 OSDs active on the host. I have only a 1 Gb/s link to the host. Could the network bandwidth be the issue?
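For what it's worth, something like the following run during backfill would show whether the link is the bottleneck (a rough sketch; eth0 is a placeholder for the actual interface name, and sar comes from the sysstat package):

    sar -n DEV 2 | grep eth0    # rxkB/s + txkB/s around 110000-120000 means the 1 Gb/s link is saturated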
The settings from sysctl.conf:
net.core.netdev_max_backlog = 250000
net.core.optmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_window_scaling = 0
net.ipv4.route.flush = 1
vm.min_free_kbytes = 2640322
vm.swappiness = 0
vm.overcommit_memory = 1
vm.oom_kill_allocating_task = 0
vm.dirty_expire_centisecs = 360000
vm.dirty_writeback_centisecs = 360000
kernel.pid_max = 4194303
fs.file-max = 16815744
vm.dirty_ratio = 99
vm.dirty_background_ratio = 99
vm.vfs_cache_pressure = 100
Thanks, Simion Rad.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com