I am afraid I have hit the same bug. Giant worked fine, but after upgrading to Hammer (0.94.1) and putting some load on the cluster, the MDSs eventually crashed, and now I am stuck in clientreplay most of the time. I am also using the cephfs kernel client (3.18.y). As I didn't find a corresponding tracker entry: is there already a patch available?
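For anyone wanting to confirm they are hitting the same thing: the stuck request can be spotted from the kernel client's debugfs directory. Roughly (the wildcard expands to a per-mount ID, so the exact path differs per system):

    # debugfs must be mounted for this path to exist
    mount -t debugfs none /sys/kernel/debug
    # one line per in-flight MDS request
    cat /sys/kernel/debug/ceph/*/mdsc

A request whose ID sits far below all the others (compare the 16,400,000 vs. 18,900,000 ranges Adam reports below) is the hung one.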
Markus

> From: mozes@xxxxxxx
> Date: Thu, 16 Apr 2015 08:04:13 -0500
> To: ukernel@xxxxxxxxx
> CC: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: [ceph-users] mds crashing
>
> (Adding back to the list)
>
> We've not seen any slow requests anywhere near that far behind. Leading
> up to the crash, the furthest behind I saw any request was ~90 seconds.
> Here is the cluster log leading up to the mds crashes:
> http://people.beocat.cis.ksu.edu/~mozes/ceph-mds-crashes-20150415.log
>
> --
> Adam
>
> On Thu, Apr 16, 2015 at 1:35 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> > On Thu, Apr 16, 2015 at 10:44 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >> We did that just after Kyle responded to John Spray above. I am
> >> rebuilding the kernel now to include dynamic printk support.
> >
> > Maybe the first crash was caused by a hung request in the MDS. Are
> > there warnings like "cluster [WRN] slow request [several thousand or
> > more] seconds old, received at ...: client_request(client.734537:23
> > getattr pAsLsXsFs ...)" in your ceph cluster log?
> >
> > Regards
> > Yan, Zheng
> >
> >> --
> >> Adam
> >>
> >> On Wed, Apr 15, 2015 at 9:37 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>> On Thu, Apr 16, 2015 at 10:24 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >>>> I don't have "dynamic_debug" enabled in the currently running
> >>>> kernel, so I can't bump the verbosity of the ceph functions. I can
> >>>> rebuild the kernel and reboot to enable dynamic_debug, but then
> >>>> we'll have to wait until we re-trigger the bug. Attached is the
> >>>> mdsc file.
> >>>>
> >>>> As for getting the mds running again: we put a route on the faulty
> >>>> client to redirect ceph traffic to the loopback device, started the
> >>>> mds again, waited for the mds to finish its full startup sequence,
> >>>> and then restored the normal routing. That seemed to clean up the
> >>>> existing session and allowed the mds to live and the client to
> >>>> reconnect -- with the above mds requests still pending/hung, of
> >>>> course.
> >>>
> >>> Did you do the trick before? The trick leaves the client in an ill
> >>> state. The MDS will crash again after the client sends another 3M
> >>> requests to it.
> >>>
> >>> Regards
> >>> Yan, Zheng
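If I understand Adam's loopback trick above correctly, it amounts to temporarily black-holing the client's ceph traffic while the MDS restarts. A sketch of what that might look like, with 10.0.0.0/24 standing in for the actual ceph cluster network:

    # on the faulty client: divert ceph traffic into the loopback device
    ip route add 10.0.0.0/24 dev lo
    # ... start the mds and let it finish its startup sequence ...
    # then restore normal routing
    ip route del 10.0.0.0/24 dev lo

As Yan warns above, though, this only postpones the problem; the client session stays in a bad state.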
> >>>> --
> >>>> Adam
> >>>>
> >>>> On Wed, Apr 15, 2015 at 9:04 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>> On Thu, Apr 16, 2015 at 9:48 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >>>>>> What is significantly smaller? We have 67 requests in the
> >>>>>> 16,400,000 range and 250 in the 18,900,000 range.
> >>>>>
> >>>>> That explains the crash. Could you help me debug this issue?
> >>>>>
> >>>>> Send /sys/kernel/debug/ceph/*/mdsc to me.
> >>>>>
> >>>>> Run "echo module ceph +p > /sys/kernel/debug/dynamic_debug/control"
> >>>>> on the cephfs mount machine.
> >>>>> Restart the mds and wait until it crashes again.
> >>>>> Run "echo module ceph -p > /sys/kernel/debug/dynamic_debug/control"
> >>>>> on the cephfs mount machine.
> >>>>> Send the kernel messages of the cephfs mount machine to me (they
> >>>>> should be in /var/log/kern.log or /var/log/messages).
> >>>>>
> >>>>> To recover from the crash, you can either force-reset the machine
> >>>>> containing the cephfs mount or add "mds wipe sessions = 1" to the
> >>>>> mds section of ceph.conf.
> >>>>>
> >>>>> Regards
> >>>>> Yan, Zheng
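To spell out Yan's debugging sequence (a sketch; it assumes the client kernel was built with CONFIG_DYNAMIC_DEBUG, which is exactly what Adam is rebuilding for above):

    # on the cephfs mount machine: enable verbose ceph debug messages
    echo module ceph +p > /sys/kernel/debug/dynamic_debug/control
    # ... restart the mds and wait until it crashes again ...
    # then switch the verbose messages back off
    echo module ceph -p > /sys/kernel/debug/dynamic_debug/control
    # the output lands in the kernel log, e.g. /var/log/kern.log

And the non-reboot recovery option would be this stanza in ceph.conf on the MDS hosts:

    [mds]
    mds wipe sessions = 1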
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Adam
> >>>>>>
> >>>>>> On Wed, Apr 15, 2015 at 8:38 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>>>> On Thu, Apr 16, 2015 at 9:07 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >>>>>>>> We are using 3.18.6-gentoo. Based on that, I was hoping that the
> >>>>>>>> kernel bug referred to in the bug report would have been fixed.
> >>>>>>>
> >>>>>>> The bug was supposed to be fixed, but you have hit it again. Could
> >>>>>>> you check whether the kernel client has any hung mds requests?
> >>>>>>> Check /sys/kernel/debug/ceph/*/mdsc on the machine that contains
> >>>>>>> the cephfs mount, and look for any request whose ID is
> >>>>>>> significantly smaller than the other requests' IDs.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Yan, Zheng
> >>>>>>>
> >>>>>>>> --
> >>>>>>>> Adam
> >>>>>>>>
> >>>>>>>> On Wed, Apr 15, 2015 at 8:02 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>>>>>> On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson <kylehutson@xxxxxxx> wrote:
> >>>>>>>>>> Thank you, John!
> >>>>>>>>>>
> >>>>>>>>>> That was exactly the bug we were hitting. My Google-fu didn't
> >>>>>>>>>> lead me to this one.
> >>>>>>>>>
> >>>>>>>>> Here is the bug report: http://tracker.ceph.com/issues/10449.
> >>>>>>>>> It is a kernel client bug which causes the session map size to
> >>>>>>>>> grow indefinitely. Which version of the Linux kernel are you
> >>>>>>>>> using?
> >>>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>> Yan, Zheng
> >>>>>>>>>
> >>>>>>>>>> On Wed, Apr 15, 2015 at 4:16 PM, John Spray <john.spray@xxxxxxxxxx> wrote:
> >>>>>>>>>>> On 15/04/2015 20:02, Kyle Hutson wrote:
> >>>>>>>>>>>> I upgraded to 0.94.1 from 0.94 on Monday, and everything had
> >>>>>>>>>>>> been going pretty well.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Then, about noon today, we had an mds crash. And then the
> >>>>>>>>>>>> failover mds crashed. And this cascaded through all 4 mds
> >>>>>>>>>>>> servers we have.
> >>>>>>>>>>>>
> >>>>>>>>>>>> If I try to start it ('service ceph start mds' on CentOS
> >>>>>>>>>>>> 7.1), it appears to be OK for a little while. ceph -w goes
> >>>>>>>>>>>> through 'replay', 'reconnect', 'rejoin', 'clientreplay' and
> >>>>>>>>>>>> 'active', but nearly immediately after getting to 'active',
> >>>>>>>>>>>> it crashes again.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I have the mds log at
> >>>>>>>>>>>> http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log
> >>>>>>>>>>>>
> >>>>>>>>>>>> For the possibly, but not necessarily, useful background
> >>>>>>>>>>>> info:
> >>>>>>>>>>>> - Yesterday we took our erasure-coded pool and increased both
> >>>>>>>>>>>> pg_num and pgp_num from 2048 to 4096. We still have several
> >>>>>>>>>>>> objects misplaced (~17%), but those seem to be continuing to
> >>>>>>>>>>>> clean themselves up.
> >>>>>>>>>>>> - We are in the midst of a large (300+ TB) rsync from our old
> >>>>>>>>>>>> (non-ceph) filesystem to this filesystem.
> >>>>>>>>>>>> - Before we realized the mds crashes, we had just changed the
> >>>>>>>>>>>> size of our metadata pool from 2 to 4.
> >>>>>>>>>>>
> >>>>>>>>>>> It looks like you're seeing
> >>>>>>>>>>> http://tracker.ceph.com/issues/10449, which is a situation
> >>>>>>>>>>> where the SessionMap object becomes too big for the MDS to
> >>>>>>>>>>> save. The cause of it in that case was stuck requests from a
> >>>>>>>>>>> misbehaving client running a slightly older kernel.
> >>>>>>>>>>>
> >>>>>>>>>>> Assuming you're using the kernel client and having a similar
> >>>>>>>>>>> problem, you could try to work around this situation by
> >>>>>>>>>>> forcibly unmounting the clients while the MDS is offline, such
> >>>>>>>>>>> that during clientreplay the MDS will remove them from the
> >>>>>>>>>>> SessionMap after timing out, and then the next time it tries
> >>>>>>>>>>> to save the map it won't be oversized. If that works, you
> >>>>>>>>>>> could then look into getting newer kernels on the clients to
> >>>>>>>>>>> avoid hitting the issue again -- the #10449 ticket has some
> >>>>>>>>>>> pointers about which kernel changes were relevant.
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers,
> >>>>>>>>>>> John
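For the record, John's forced-unmount workaround would presumably look like this on each client while the MDS is down (/mnt/cephfs is a placeholder for the real mount point):

    # force the unmount even though the MDS is unreachable
    umount -f /mnt/cephfs
    # if that hangs, a lazy unmount detaches the mount point immediately
    umount -l /mnt/cephfs

Failing that, newer client kernels, as the #10449 ticket suggests, would be the longer-term fix.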
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com