Hi Zheng,

Many, many thanks for your help. Your suggestion of setting large values for mds_cache_size and mds_cache_memory_limit stopped our MDS crashing :)

The values in ceph.conf are now:

    mds_cache_size = 8589934592
    mds_cache_memory_limit = 17179869184

Should these values be left in our configuration?

Again, thanks for the assistance,

Jake

On 2/11/19 8:17 AM, Yan, Zheng wrote:
> On Sat, Feb 9, 2019 at 12:36 AM Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
>>
>> Dear All,
>>
>> Unfortunately the MDS has crashed on our Mimic cluster...
>>
>> The first symptom was rsync giving:
>> "No space left on device (28)"
>> when trying to rename or delete files.
>>
>> This prompted me to try restarting the MDS, as it was reported as laggy.
>>
>> Restarting the MDS shows this error in the log before the crash:
>>
>> elist.h: 39: FAILED assert(!is_on_list())
>>
>> A full MDS log showing the crash is here:
>>
>> http://p.ip.fi/iWlz
>>
>> I've tried upgrading the cluster to 13.2.4, but the MDS still crashes...
>>
>> The cluster has 10 nodes and 254 OSDs, and uses EC for the data pool
>> and 3x replication for the metadata pool. We have a single active MDS,
>> with two standby MDS daemons.
>>
>> We have ~2PB of CephFS data here, all of which is currently
>> inaccessible; any and all advice gratefully received :)
>>
>
> Add mds_cache_size and mds_cache_memory_limit to ceph.conf and set
> them to very large values before starting the MDS. If the MDS does not
> crash, restore mds_cache_size and mds_cache_memory_limit to their
> original values (via the admin socket) after the MDS has been active
> for 10 seconds.
>
> If the MDS still crashes, try compiling ceph-mds with the following patch:
>
> diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
> index d3461fba2e..c2731e824c 100644
> --- a/src/mds/CDir.cc
> +++ b/src/mds/CDir.cc
> @@ -508,6 +508,8 @@ void CDir::remove_dentry(CDentry *dn)
>    // clean?
>    if (dn->is_dirty())
>      dn->mark_clean();
> +  if (inode->is_stray())
> +    dn->item_stray.remove_myself();
>
>    if (dn->state_test(CDentry::STATE_BOTTOMLRU))
>      cache->bottom_lru.lru_remove(dn);
>
>
>> best regards,
>>
>> Jake
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
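
[Editor's note: restoring the original values over the admin socket, as Zheng suggests above, would look roughly like the sketch below. The daemon name mds.a is an illustrative assumption; substitute the name of your active MDS. The values shown are the usual Mimic defaults (mds_cache_size = 0, i.e. no inode-count limit, and mds_cache_memory_limit = 1 GiB); verify your own originals before restoring.]

    # Run on the host where the active MDS lives; "mds.a" is a placeholder name.
    # Check the value currently in effect:
    ceph daemon mds.a config get mds_cache_memory_limit

    # Restore the original limits at runtime, without restarting the MDS:
    ceph daemon mds.a config set mds_cache_size 0
    ceph daemon mds.a config set mds_cache_memory_limit 1073741824

Note that admin-socket changes do not persist across a daemon restart, so the oversized overrides should also be removed from ceph.conf; otherwise the next restart will reapply them.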