On Thu, Jul 20, 2017 at 6:35 PM, Дмитрий Глушенок <glush@xxxxxxxxxx> wrote:
> Hi Ilya,
>
> While trying to reproduce the issue I've found that:
> - it is relatively easy to reproduce 5-6 minute hangs just by killing the
>   active mds process (triggering failover) while writing a lot of data.
>   That is an unacceptable timeout, but it is not the case described in
>   http://tracker.ceph.com/issues/15255
> - it is hard to reproduce the endless hang (I've spent an hour without
>   success)
>
> One thing I've noticed while analysing the logs is that the "endless hang"
> was always accompanied by the following messages:
>
> Jul 20 15:31:57 mn-ceph-nfs-gw-01 kernel: libceph: mon0 10.50.67.25:6789 session lost, hunting for new mon
> Jul 20 15:31:57 mn-ceph-nfs-gw-01 kernel: libceph: mon1 10.50.67.26:6789 session established
> Jul 20 15:32:27 mn-ceph-nfs-gw-01 kernel: libceph: mon1 10.50.67.26:6789 session lost, hunting for new mon
> Jul 20 15:32:27 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session established
> Jul 20 15:32:57 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session lost, hunting for new mon
> Jul 20 15:32:57 mn-ceph-nfs-gw-01 kernel: libceph: mon0 10.50.67.25:6789 session established
> Jul 20 15:33:28 mn-ceph-nfs-gw-01 kernel: libceph: mon0 10.50.67.25:6789 session lost, hunting for new mon
> Jul 20 15:33:28 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session established
> Jul 20 15:33:58 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session lost, hunting for new mon
> Jul 20 15:34:29 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session established
>
> Bug http://tracker.ceph.com/issues/17664 describes this behaviour, and it was
> fixed in releases starting with v11.1.0 (I'm using 10.2.7). So the lost
> session somehow triggers client disconnection and fencing (as described at
> http://docs.ceph.com/docs/master/cephfs/troubleshooting/#disconnected-remounted-fs).
>
> Do you still think it should be posted to
> http://tracker.ceph.com/issues/15255 ?

Are you using async messenger?  You can check with

$ ceph daemon mon.X config get ms_type

for all MONs.

Thanks,

                Ilya
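
To make the pattern in the quoted kernel log easier to see, here is a rough Python sketch that extracts the libceph monitor session events and measures the gap between consecutive "session lost" messages. The function names and the regular expression are illustrative only, not part of any Ceph tooling, and the year is an assumption taken from the date of this thread, since syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Sample lines copied from the kernel log quoted earlier in this thread.
LOG = """\
Jul 20 15:31:57 mn-ceph-nfs-gw-01 kernel: libceph: mon0 10.50.67.25:6789 session lost, hunting for new mon
Jul 20 15:31:57 mn-ceph-nfs-gw-01 kernel: libceph: mon1 10.50.67.26:6789 session established
Jul 20 15:32:27 mn-ceph-nfs-gw-01 kernel: libceph: mon1 10.50.67.26:6789 session lost, hunting for new mon
Jul 20 15:32:27 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session established
Jul 20 15:32:57 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session lost, hunting for new mon
Jul 20 15:32:57 mn-ceph-nfs-gw-01 kernel: libceph: mon0 10.50.67.25:6789 session established
Jul 20 15:33:28 mn-ceph-nfs-gw-01 kernel: libceph: mon0 10.50.67.25:6789 session lost, hunting for new mon
Jul 20 15:33:28 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session established
Jul 20 15:33:58 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session lost, hunting for new mon
Jul 20 15:34:29 mn-ceph-nfs-gw-01 kernel: libceph: mon2 10.50.67.27:6789 session established
"""

# Hypothetical pattern for the libceph lines shown above; other kernel
# versions may word these messages differently.
PATTERN = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: libceph: "
    r"(?P<mon>mon\d+) \S+ session (?P<event>lost|established)"
)


def parse_events(text, year=2017):
    """Return (timestamp, mon, event) tuples for each matching log line.

    The year is an assumption: syslog timestamps omit it.
    """
    events = []
    for line in text.splitlines():
        m = PATTERN.match(line.strip())
        if not m:
            continue  # skip anything that is not a mon session message
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["mon"], m["event"]))
    return events


def lost_intervals(events):
    """Seconds between consecutive 'session lost' events."""
    lost = [ts for ts, _, event in events if event == "lost"]
    return [int((b - a).total_seconds()) for a, b in zip(lost, lost[1:])]


if __name__ == "__main__":
    print(lost_intervals(parse_events(LOG)))  # [30, 30, 31, 30]
```

On the quoted excerpt this shows the client dropping its monitor session about every 30 seconds while rotating through mon0, mon1 and mon2. It only summarises the log; it says nothing about the cause.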