Hi Derek,

On Mon, Sep 18, 2017 at 1:30 PM, Derek Yarnell <derek@xxxxxxxxxxxxxx> wrote:
> We have a recent cluster upgraded from Jewel to Luminous. Today we had
> a segmentation fault that left the file system degraded. Systemd then
> restarted the daemon over and over, each time with a different stack
> trace (visible after the 10k events in the log file[0]).
>
> We then tried to fail over to the standby MDS, which also kept failing.
> After shutting down both MDSs for some time, we brought one back online;
> by then the clients had apparently been out long enough to be evicted.
> We were then able to reboot the clients (RHEL 7.4) and have them
> reconnect to the file system.

This looks like an instance of:

http://tracker.ceph.com/issues/21070

The upcoming v12.2.1 release has the fix. Until then, you will need to
apply the patch locally (see the sketch below).

-- 
Patrick Donnelly
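
In case it's useful, here is a rough sketch of one way to apply the patch
locally, assuming you build from source and are on v12.2.0. The cherry-pick
target is a placeholder; take the actual commit (or patch file) from the PR
linked on the tracker issue:

    # Check out the release you are running (v12.2.0 assumed here).
    git clone --branch v12.2.0 https://github.com/ceph/ceph.git
    cd ceph

    # Apply the fix. The hash below is a placeholder -- use the commit
    # referenced by http://tracker.ceph.com/issues/21070.
    git cherry-pick <fix-commit-hash>

    # Build just the MDS; install-deps.sh and do_cmake.sh ship at the
    # top of the Ceph source tree.
    ./install-deps.sh
    ./do_cmake.sh
    cd build && make ceph-mds

If you run distro packages instead, the same commit can be applied as a
patch in a source-package rebuild (e.g. rpmbuild on RHEL).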