Quoting Stefan Kooman (stefan@xxxxxx):
> Hi Patrick,
>
> Quoting Stefan Kooman (stefan@xxxxxx):
> > Quoting Stefan Kooman (stefan@xxxxxx):
> > > Quoting Patrick Donnelly (pdonnell@xxxxxxxxxx):
> > > > Thanks for the detailed notes. It looks like the MDS is stuck
> > > > somewhere it's not even outputting any log messages. If possible, it'd
> > > > be helpful to get a coredump (e.g. by sending SIGQUIT to the MDS) or,
> > > > if you're comfortable with gdb, a backtrace of any threads that look
> > > > suspicious (e.g. not waiting on a futex) including `info threads`.
> >
> > Today the issue reappeared (after being absent for ~ 3 weeks). This time
> > the standby MDS could take over and did not get into a deadlock
> > itself. We made gdb traces again, which you can find over here:
> >
> > https://8n1.org/14011/d444
>
> We are still seeing these crashes occur ~ every 3 weeks or so. Have you
> found the time to look into the backtraces / gdb dumps?

We have not seen this issue for the past three months. We have updated
the cluster to 12.2.11 in the meantime, but we are not sure whether that
is related. Hopefully it stays away.

FYI,

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / info@xxxxxx