Re: cephfs-snapshots causing mds failover, hangs

thoralf schulze <t.schulze@xxxxxxxxxxxx> · Mon, 26 Aug 2019 12:57:24 +0200

hi Zheng,

On 8/21/19 4:32 AM, Yan, Zheng wrote:
> Please enable debug mds (debug_mds=10), and try reproducing it again.

please find the logs at
https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz .

we managed to reproduce the issue in a worst-case scenario: before
snapshotting, juju-d0f708-5-lxd-1 and juju-d0f708-10-lxd-1 were the
active mds daemons, with juju-d0f708-3-lxd-1 and juju-d0f708-9-lxd-1 as
standbys. we created the snapshot at ~08:11:50. a little later the
failover happened and juju-d0f708-5-lxd-1 and juju-d0f708-10-lxd-1 went
missing. a little later still, the now-active juju-d0f708-3-lxd-1 and
juju-d0f708-9-lxd-1 dropped out of the cluster as well. we started
restarting all mds daemons at ~08:16.

thank you very much & with kind regards,
t.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com