On Wed, Dec 21, 2016 at 11:10 PM, Stéphane Klein <contact@xxxxxxxxxxxxxxxxxxx> wrote:
>
> 2016-12-21 23:06 GMT+01:00 Ilya Dryomov <idryomov@xxxxxxxxx>:
>>
>> What's the output of "cat /proc/$(pidof rm)/stack?
>
>
> root@ceph-client-3:/home/vagrant# cat /proc/2315/stack
> [<ffffffff8115313e>] sleep_on_page+0xe/0x20
> [<ffffffff81152eff>] wait_on_page_bit+0x7f/0x90
> [<ffffffff81162b6e>] truncate_inode_pages_range+0x2fe/0x5a0
> [<ffffffff81162e25>] truncate_inode_pages+0x15/0x20
> [<ffffffff8124b27e>] ext4_evict_inode+0x12e/0x510
> [<ffffffff811ddb50>] evict+0xb0/0x1b0
> [<ffffffff811de365>] iput+0xf5/0x180
> [<ffffffff811d2cde>] do_unlinkat+0x18e/0x2b0
> [<ffffffff811d3c0b>] SyS_unlinkat+0x1b/0x40
> [<ffffffff8173d9dd>] system_call_fastpath+0x1a/0x1f
> [<ffffffffffffffff>] 0xffffffffffffffff
>
>>
>>
>> Can you do "echo w >/proc/sysrq-trigger", "echo t >/proc/sysrq-trigger"
>> on the ceph-client VM and attach dmesg output?
>
>
> https://gist.github.com/harobed/37e23ced839f17d91a0e43435348205a

[   73.086952] libceph: loaded (mon/osd proto 15/24)
[   73.089990] rbd: loaded rbd (rados block device)
[   73.091532] libceph: mon1 172.28.128.3:6789 feature set mismatch, my 4a042a42 < server's 2004a042a42, missing 20000000000
[   73.092582] libceph: mon1 172.28.128.3:6789 socket error on read

Where is this coming from - I thought you said you set tunables to
legacy?  Or are you rolling out a new set of VMs each time?

What if you boot ceph-client-3 with >512M memory, say 2G?

What if you manually upgrade the kernel on ceph-client-3 to 3.19.* or
whatever is available on trusty?

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
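
As a side note on the "feature set mismatch" line in the dmesg excerpt above: the "missing" value looks like the set of bits present in the server's feature mask but absent from the client's. The following is only a minimal sketch of that arithmetic, assuming both values are plain hexadecimal bitmasks as printed in the log. The helper names missing_mask and missing_bits are hypothetical and are not part of libceph or any Ceph tool.

```python
# Sketch only: reproduces the arithmetic behind the "missing" value in the
# libceph log line. The helper names below are hypothetical, not a Ceph API.

def missing_mask(mine, server):
    """Return the bits set in the server's mask but not in the client's."""
    return server & ~mine


def missing_bits(mine, server):
    """Return the zero-based positions of the missing feature bits."""
    mask = missing_mask(mine, server)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    mine = 0x4a042a42        # "my" value from the log line
    server = 0x2004a042a42   # "server's" value from the log line

    print(hex(missing_mask(mine, server)))  # 0x20000000000
    print(missing_bits(mine, server))       # [41]
```

With the values from the log this gives a single missing bit, number 41 (counting from zero), which matches the "missing 20000000000" reported by the kernel. Which feature that bit corresponds to would have to be looked up in the Ceph feature definitions for the release in question.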