Hi, we’re testing Mimic on a VM farm. It consists of 4 OSD hosts (8 OSDs) and 3 MONs. We mounted it as RBD and as CephFS (FUSE and kernel mount) on different clients without problems. Then one day we performed a failover test and stopped one of the OSDs. I'm not sure whether it is related, but since that test the RBD client freezes when trying to mount the rbd device.
Steps to reproduce:

# modprobe rbd

(dmesg)
[ 309.997587] Key type dns_resolver registered
[ 310.043647] Key type ceph registered
[ 310.044325] libceph: loaded (mon/osd proto 15/24)
[ 310.054548] rbd: loaded

# rbd -n client.acapp1 map 4copy/foo
/dev/rbd0

# rbd showmapped
id pool  image snap device
0  4copy foo   -    /dev/rbd0

The client then hangs if I try to mount the device or reboot the server after rbd map. There are a lot of errors in dmesg, e.g.:

Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: blacklist of client74700 failed: -13
Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: failed to acquire lock: -13
Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: no lock owners detected
Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: client74700 seems dead, breaking lock
Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: blacklist of client74700 failed: -13
Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: failed to acquire lock: -13
Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: no lock owners detected

The versions we use are as follows:

# ceph -v
ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable)

# modinfo rbd
filename: /lib/modules/3.10.0-862.11.6.el7.x86_64/kernel/drivers/block/rbd.ko.xz
license: GPL
description: RADOS Block Device (RBD) driver
author: Jeff Garzik <jeff@xxxxxxxxxx>
author: Yehuda Sadeh <yehuda@xxxxxxxxxxxxxxx>
author: Sage Weil <sage@xxxxxxxxxxxx>
author: Alex Elder <elder@xxxxxxxxxxx>
retpoline: Y
rhelversion: 7.5
srcversion: 3486A669C909DC30C49A49C
depends: libceph
intree: Y
vermagic: 3.10.0-862.11.6.el7.x86_64 SMP mod_unload modversions
signer: CentOS Linux kernel signing key
sig_key: D4:11:5F:11:00:55:DB:56:C8:D6:05:AB:75:21:73:CF:B1:AC:54:D8
sig_hashalgo: sha256
parm: single_major:Use a single major number for all rbd devices (default: false) (bool)

There is no problem with the CephFS clients at all.
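Since the same few messages repeat many times, it may help to collapse them into counts before reading the log. Below is a minimal sketch, assuming syslog-style lines in the format quoted above; summarize_rbd_errors is a hypothetical helper name, not part of Ceph. For reference, -13 is the Linux errno EACCES (permission denied), so the capabilities of client.acapp1 may be worth a look, although whether that is the actual cause here is not established.

```shell
# Hypothetical helper: strip the "date host kernel:" prefix from syslog lines,
# keep only rbd messages, and count how often each distinct message occurs.
summarize_rbd_errors() {
    sed -n 's/^.*kernel: \(rbd: .*\)$/\1/p' | sort | uniq -c | sort -rn
}

# Possible usage on the client (assumes the kernel log goes to /var/log/messages):
#   summarize_rbd_errors < /var/log/messages
#
# To inspect the client's capabilities from a node with an admin keyring:
#   ceph auth get client.acapp1
```

The most frequent message appears first, which makes it easier to see whether the blacklist and lock failures always occur together.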
Would anyone please help?

Thanks and regards,
/st wong