Re: MDS lost, Filesystem degraded and wont mount

Hi Janek,

I'd love to hear your standard maintenance procedures. Are you
cleaning up those open files outside of "rejoin" OOMs?

I guess we're pretty lucky with our CephFS's, because we have more than
1k clients and it is pretty solid (though the last upgrade had a
hiccup when we decreased to a single active MDS).

-- Dan



On Fri, Dec 4, 2020 at 8:20 PM Janek Bevendorff
<janek.bevendorff@xxxxxxxxxxxxx> wrote:
>
> This is a very common issue. Deleting mdsX_openfiles.Y has become part of
> my standard maintenance repertoire. As soon as you have a few more
> clients and one of them starts opening and closing files in rapid
> succession (or does other metadata-heavy things), it becomes very likely
> that the MDS crashes and is unable to recover.
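>
> For reference, a hedged sketch of that cleanup (the pool name here is
> assumed, check yours with "ceph fs ls"; there may be more shards than
> .0, and all MDS daemons must be stopped first):
>
> # rados -p cephfs_metadata ls | grep '^mds0_openfiles'
> # rados -p cephfs_metadata rm mds0_openfiles.0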
>
> There have been numerous fixes in the past, which improved the overall
> stability, but it is far from perfect. I am happy to see another patch
> in that direction, but I believe more effort needs to be spent here. It
> is way too easy to DoS the MDS from a single client. Our 78-node CephFS
> beats our old NFS RAID server in terms of throughput, but latency and
> stability are way behind.
>
> Janek
>
> On 04/12/2020 11:39, Dan van der Ster wrote:
> > Excellent!
> >
> > For the record, this PR is the plan to fix this:
> > https://github.com/ceph/ceph/pull/36089
> > (nautilus, octopus PRs here: https://github.com/ceph/ceph/pull/37382
> > https://github.com/ceph/ceph/pull/37383)
> >
> > Cheers, Dan
> >
> > On Fri, Dec 4, 2020 at 11:35 AM Anton Aleksandrov <anton@xxxxxxxxxxxxxx> wrote:
> >> Thank you very much! This solution helped:
> >>
> >> Stop all MDS, then:
> >> # rados -p cephfs_metadata_pool rm mds0_openfiles.0
> >> then start one MDS.
> >>
> >> We are back online. Amazing!!!  :)
> >>
> >>
> >> On 04.12.2020 12:20, Dan van der Ster wrote:
> >>> Please also make sure that mds_beacon_grace is high on the mons, too.
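> >>> E.g., a hedged way to raise it on the mons at runtime (verify the
> >>> syntax on your release):
> >>> # ceph tell mon.* injectargs '--mds_beacon_grace=600'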
> >>>
> >>> It doesn't matter which MDS you select to be the running one.
> >>>
> >>> Is the process getting killed and restarted?
> >>> If you're confident that the mds is getting OOM killed during rejoin
> >>> step, then you might find this useful:
> >>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-August/028964.html
> >>>
> >>> Stop all MDS, then:
> >>> # rados -p cephfs_metadata_pool rm mds0_openfiles.0
> >>> then start one MDS.
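> >>>
> >>> With packaged systemd units, that sequence would look roughly like
> >>> this (unit and pool names assumed, adjust to your deployment):
> >>> # systemctl stop ceph-mds.target          # on every MDS host
> >>> # rados -p cephfs_metadata_pool rm mds0_openfiles.0
> >>> # systemctl start ceph-mds@<name>         # on one MDS host only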
> >>>
> >>> -- Dan
> >>>
> >>> On Fri, Dec 4, 2020 at 11:05 AM Anton Aleksandrov <anton@xxxxxxxxxxxxxx> wrote:
> >>>> Yes, the MDS eats all memory+swap, stays like that for a moment, and
> >>>> then frees the memory.
> >>>>
> >>>> mds_beacon_grace was already set to 1800
> >>>>
> >>>> Also, on the other one we see this message: Map has assigned me to
> >>>> become a standby.
> >>>>
> >>>> Does it matter, which MDS we stop and which we leave running?
> >>>>
> >>>> Anton
> >>>>
> >>>>
> >>>> On 04.12.2020 11:53, Dan van der Ster wrote:
> >>>>> How many active MDS's did you have? (max_mds == 1, right?)
> >>>>>
> >>>>> Stop the other two MDS's so you can focus on getting exactly one running.
> >>>>> Tail the log file and see what it is reporting.
> >>>>> Increase mds_beacon_grace to 600 so that the mon doesn't fail this MDS
> >>>>> while it is rejoining.
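> >>>>>
> >>>>> A hedged sketch of those checks (filesystem and daemon names are
> >>>>> placeholders; the grace setting can also go into ceph.conf and be
> >>>>> applied with a restart):
> >>>>> # ceph fs get <fsname> | grep max_mds
> >>>>> # ceph tell mds.<name> injectargs '--mds_beacon_grace=600'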
> >>>>>
> >>>>> Is that single MDS running out of memory during the rejoin phase?
> >>>>>
> >>>>> -- dan
> >>>>>
> >>>>> On Fri, Dec 4, 2020 at 10:49 AM Anton Aleksandrov <anton@xxxxxxxxxxxxxx> wrote:
> >>>>>> Hello community,
> >>>>>>
> >>>>>> we are on Ceph 13.2.8 - today something happened with one MDS, and ceph
> >>>>>> status reports that the filesystem is degraded. It won't mount either. I
> >>>>>> have taken the server with the failed MDS down. There are 2 more MDS
> >>>>>> servers, but they stay in "rejoin" state. Also, only 1 is shown under
> >>>>>> "services", even though there are 2.
> >>>>>>
> >>>>>> Both running MDS servers have these lines in their logs:
> >>>>>>
> >>>>>> heartbeat_map is_healthy 'MDSRank' had timed out after 15
> >>>>>> mds.beacon.mds2 Skipping beacon heartbeat to monitors (last acked
> >>>>>> 28.8979s ago); MDS internal heartbeat is not healthy!
> >>>>>>
> >>>>>> On one of the MDS nodes I enabled more detailed debug logging, so there
> >>>>>> I am also seeing:
> >>>>>>
> >>>>>> mds.beacon.mds3 Sending beacon up:standby seq 178
> >>>>>> mds.beacon.mds3 received beacon reply up:standby seq 178 rtt 0.000999968
> >>>>>>
> >>>>>> It makes no sense to me, and there is too much stress in my head... Could
> >>>>>> anyone help, please?
> >>>>>>
> >>>>>> Anton.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


