On Mon, Jan 28, 2019 at 7:31 AM ST Wong (ITSC) <ST@xxxxxxxxxxxxxxxx> wrote:
>
> > That doesn't appear to be an error -- that's just stating that it found a dead client that was holding the exclusive-lock, so it broke the dead client's lock on the image (by blacklisting the client).
>
> As there is only 1 RBD client in this testing, does it mean the RBD client process keeps failing?
> In a fresh boot RBD client, doing some basic operations also gives the warning:
>
> ---------------- cut here ----------------
> # rbd -n client.acapp1 map 4copy/foo
> /dev/rbd0
> # mount /dev/rbd0 /4copy
> # cd /4copy; ls
>
>
> # tail /var/log/messages
> Jan 28 14:23:39 acapp1 kernel: Key type ceph registered
> Jan 28 14:23:39 acapp1 kernel: libceph: loaded (mon/osd proto 15/24)
> Jan 28 14:23:39 acapp1 kernel: rbd: loaded (major 252)
> Jan 28 14:23:39 acapp1 kernel: libceph: mon2 192.168.1.156:6789 session established
> Jan 28 14:23:39 acapp1 kernel: libceph: client80624 fsid cc795498-5d16-4b84-9584-1788d0458be9
> Jan 28 14:23:39 acapp1 kernel: rbd: rbd0: capacity 10737418240 features 0x5
> Jan 28 14:23:44 acapp1 kernel: XFS (rbd0): Mounting V5 Filesystem
> Jan 28 14:23:44 acapp1 kernel: rbd: rbd0: client80621 seems dead, breaking lock <--
> Jan 28 14:23:45 acapp1 kernel: XFS (rbd0): Starting recovery (logdev: internal)
> Jan 28 14:23:45 acapp1 kernel: XFS (rbd0): Ending recovery (logdev: internal)
>
> ---------------- cut here ----------------
>
> Is this normal?

Yes -- the lock isn't released because you are hard resetting your machine. When it comes back up, the new client fences the old client to avoid split brain.

>
> Besides, repeated the testing:
> * Map and mount the rbd device, read/write ok.
> * Umount all rbd, then reboot without problem
> * Reboot hangs if not umounting all rbd before reboot:
>
> ---------------- cut here ----------------
> Jan 28 14:13:12 acapp1 kernel: rbd: rbd0: client80531 seems dead, breaking lock
> Jan 28 14:13:13 acapp1 kernel: XFS (rbd0): Ending clean mount <-- Reboot hangs here
> Jan 28 14:14:06 acapp1 systemd: Stopping Session 1 of user root. <-- pressing power reset
> Jan 28 14:14:06 acapp1 systemd: Stopped target Multi-User System.
> ---------------- cut here ----------------
>
> Is it necessary to umount all RBD before rebooting the client host?

Yes, it's necessary. If you enable rbdmap.service, it should do it for you:

https://github.com/ceph/ceph/blob/f52c22ebf5ff24107faf061a8de1f36376ed515d/systemd/rbdmap.service.in#L15

Thanks,

                Ilya
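
A minimal sketch of what an rbdmap setup could look like for the image in this thread. The pool/image (4copy/foo), client id (acapp1), and mount point (/4copy) come from the commands quoted above; the keyring path is an assumption and should be replaced with wherever that client's keyring actually lives.

```
# /etc/ceph/rbdmap -- one image per line: pool/image, then map options
# (the keyring path is an assumed location, adjust to your setup)
4copy/foo    id=acapp1,keyring=/etc/ceph/ceph.client.acapp1.keyring

# /etc/fstab -- noauto, so that rbdmap mounts the device after mapping it
# instead of the init system trying to mount it too early
/dev/rbd/4copy/foo    /4copy    xfs    noauto    0 0
```

With both entries in place, running `systemctl enable rbdmap.service` should map and mount the image at boot, and unmount and unmap it when the service is stopped at shutdown, so the exclusive lock is released cleanly as long as the machine is not hard reset.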