Hi all,

We have been benchmarking a hyperconverged CephFS cluster (kernel clients and OSDs on the same machines) for a while. Over the weekend, for the first time, one CephFS mount deadlocked while some clients were running ior.

All the ior processes are stuck in D state with this stack:

    [<ffffffffafdb53a3>] wait_on_page_bit+0x83/0xa0
    [<ffffffffafdb54d1>] __filemap_fdatawait_range+0x111/0x190
    [<ffffffffafdb5564>] filemap_fdatawait_range+0x14/0x30
    [<ffffffffafdb79e6>] filemap_write_and_wait_range+0x56/0x90
    [<ffffffffc0f11575>] ceph_fsync+0x55/0x420 [ceph]
    [<ffffffffafe76247>] do_fsync+0x67/0xb0
    [<ffffffffafe76530>] SyS_fsync+0x10/0x20
    [<ffffffffb0372d5b>] system_call_fastpath+0x22/0x27
    [<ffffffffffffffff>] 0xffffffffffffffff

We tried restarting the co-located OSDs and evicting the client, but the processes stay deadlocked.

We've seen the recent issue related to co-location (https://bugzilla.redhat.com/show_bug.cgi?id=1665248), but we don't have the `usercopy` warning in dmesg.

Are there other known issues related to co-locating?

Thanks!

Dan
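P.S. For anyone who wants to check their own co-located nodes for the same symptom, here is a minimal sketch of how blocked tasks and their kernel stacks can be listed. It is not necessarily how the trace above was collected. It assumes a Linux /proc with /proc/<pid>/stack available (typically readable only by root), and the function names `parse_stat` and `blocked_tasks` are made up for this example.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    pid = int(stat_text[:lparen].strip())
    comm = stat_text[lparen + 1:rparen]
    state = stat_text[rparen + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm, stack) for processes in uninterruptible sleep (D).

    Only the main thread of each process is inspected; per-thread state
    lives under /proc/<pid>/task/ and is not covered by this sketch.
    """
    results = []
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat could not be parsed
        if state != "D":
            continue
        try:
            with open(os.path.join(base, "stack")) as f:
                stack = f.read()
        except OSError:
            stack = "(stack not readable; this usually requires root)\n"
        results.append((pid, comm, stack))
    return results


if __name__ == "__main__":
    for pid, comm, stack in blocked_tasks():
        print(f"{pid} {comm}")
        print(stack)
```

Run as root on a client node, this prints one entry per process whose main thread is in D state. A process that appears here across repeated runs with the same stack is a candidate for a real hang rather than a slow I/O.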