Hello all,
When I try to add more than one OSD to a host and the backfilling process starts, all the OSD daemons except one of them become stuck in D state. When this happens they are shown as out and down (when running ceph osd tree). The only way I can kill the processes is to remove the OSDs from the crush map, then run kill -9 on them and wait for a couple of minutes. There are no exception messages in the OSD logs, and dmesg looks fine too (nothing out of the ordinary).

I run Ceph Firefly 0.80.10 on Ubuntu 14.04 (Linux 3.13). The OSDs are running on RAID0 LUNs (2 drives per disk group) created on a Dell MD3000 array with Hitachi hard drives (450 GB, 15K RPM). The issue happens even with only 2 or 3 OSDs active on the host. I have only a 1 Gb/s link to the host. Could the network bandwidth be the issue?

These are the settings from sysctl.conf:

net.core.netdev_max_backlog = 250000
net.core.optmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_window_scaling = 0
net.ipv4.route.flush = 1
vm.min_free_kbytes = 2640322
vm.swappiness = 0
vm.overcommit_memory = 1
vm.oom_kill_allocating_task = 0
vm.dirty_expire_centisecs = 360000
vm.dirty_writeback_centisecs = 360000
kernel.pid_max = 4194303
fs.file-max = 16815744
vm.dirty_ratio = 99
vm.dirty_background_ratio = 99
vm.vfs_cache_pressure = 100

Thanks,
Simion Rad.
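P.S. For clarity, this is roughly the sequence I end up running for each stuck daemon (osd.12 and the pid are just placeholders, not the actual IDs):

    # remove the OSD from the crush map
    ceph osd crush remove osd.12
    # the daemon is stuck in D state, so a normal stop does nothing; force-kill it
    kill -9 <pid of ceph-osd -i 12>
    # then wait a couple of minutes until the process finally disappears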