Hi Derek,

On Mon, Sep 18, 2017 at 1:30 PM, Derek Yarnell <derek@xxxxxxxxxxxxxx> wrote:
> We have a recent cluster upgraded from Jewel to Luminous. Today we had
> a segmentation fault that left the file system degraded. Systemd then
> restarted the daemon over and over, each time with a different stack
> trace (visible after the 10k events in the log file[0]).
>
> We then tried to fail over to the standby MDS, which also kept failing.
> After shutting down both MDSs for some time, we brought one back online;
> by then the clients had apparently been out long enough to be evicted.
> We were then able to reboot the clients (RHEL 7.4) and have them
> reconnect to the file system.

This looks like an instance of:

http://tracker.ceph.com/issues/21070

The upcoming v12.2.1 release has the fix. Until then, you will need to
apply the patch locally (see the sketch below).

-- 
Patrick Donnelly
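
In case it's useful, here is a rough sketch of one way to apply the patch
locally, assuming you build from source and are on v12.2.0. The cherry-pick
target is a placeholder; take the actual commit (or patch file) from the PR
linked on the tracker issue:

    # Check out the release you are running (v12.2.0 assumed here).
    git clone --branch v12.2.0 https://github.com/ceph/ceph.git
    cd ceph

    # Apply the fix. The hash below is a placeholder -- use the commit
    # referenced by http://tracker.ceph.com/issues/21070.
    git cherry-pick <fix-commit-hash>

    # Build just the MDS; install-deps.sh and do_cmake.sh ship at the
    # top of the Ceph source tree.
    ./install-deps.sh
    ./do_cmake.sh
    cd build && make ceph-mds

If you run distro packages instead, the same commit can be applied as a
patch in a source-package rebuild (e.g. rpmbuild on RHEL).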