I am afraid I have hit the same bug. Giant worked fine, but after upgrading to Hammer (0.94.1) and putting some load on the cluster, the MDSs eventually crashed, and now I am stuck in clientreplay most of the time. I am also using the cephfs kernel client (3.18.y). As I didn't find a corresponding tracker entry: is there already a patch available?
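For anyone wanting to confirm they are hitting the same thing: the stuck request can be spotted from the kernel client's debugfs directory. Roughly (the wildcard expands to a per-mount ID, so the exact path differs per system):

    # debugfs must be mounted for this path to exist
    mount -t debugfs none /sys/kernel/debug
    # one line per in-flight MDS request
    cat /sys/kernel/debug/ceph/*/mdsc

A request whose ID sits far below all the others (compare the 16,400,000 vs. 18,900,000 ranges Adam reports below) is the hung one.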
Markus

> From: mozes@xxxxxxx
> Date: Thu, 16 Apr 2015 08:04:13 -0500
> To: ukernel@xxxxxxxxx
> CC: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: [ceph-users] mds crashing
>
> (Adding back to the list)
>
> We've not seen any slow requests anywhere near that far behind. Leading
> up to the crash, the furthest behind I saw any request was ~90 seconds.
> Here is the cluster log leading up to the mds crashes:
> http://people.beocat.cis.ksu.edu/~mozes/ceph-mds-crashes-20150415.log
>
> --
> Adam
>
> On Thu, Apr 16, 2015 at 1:35 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> > On Thu, Apr 16, 2015 at 10:44 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >> We did that just after Kyle responded to John Spray above. I am
> >> rebuilding the kernel now to include dynamic printk support.
> >
> > Maybe the first crash was caused by a hung request in the MDS. Are
> > there warnings like "cluster [WRN] slow request [several thousand or
> > more] seconds old, received at ...: client_request(client.734537:23
> > getattr pAsLsXsFs ...)" in your ceph cluster log?
> >
> > Regards
> > Yan, Zheng
> >
> >> --
> >> Adam
> >>
> >> On Wed, Apr 15, 2015 at 9:37 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>> On Thu, Apr 16, 2015 at 10:24 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >>>> I don't have "dynamic_debug" enabled in the currently running
> >>>> kernel, so I can't bump the verbosity of the ceph functions. I can
> >>>> rebuild the kernel and reboot to enable dynamic_debug, but then
> >>>> we'll have to wait until we re-trigger the bug. Attached is the
> >>>> mdsc file.
> >>>>
> >>>> As for getting the mds running again: we put a route on the faulty
> >>>> client to redirect ceph traffic to the loopback device, started the
> >>>> mds again, waited for the mds to finish its full startup sequence,
> >>>> and then restored the normal routing. That seemed to clean up the
> >>>> existing session and allowed the mds to live and the client to
> >>>> reconnect -- with the above mds requests still pending/hung, of
> >>>> course.
> >>>
> >>> Did you do the trick before? The trick leaves the client in an ill
> >>> state. The MDS will crash again after the client sends another 3M
> >>> requests to it.
> >>>
> >>> Regards
> >>> Yan, Zheng
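If I understand Adam's loopback trick above correctly, it amounts to temporarily black-holing the client's ceph traffic while the MDS restarts. A sketch of what that might look like, with 10.0.0.0/24 standing in for the actual ceph cluster network:

    # on the faulty client: divert ceph traffic into the loopback device
    ip route add 10.0.0.0/24 dev lo
    # ... start the mds and let it finish its startup sequence ...
    # then restore normal routing
    ip route del 10.0.0.0/24 dev lo

As Yan warns above, though, this only postpones the problem; the client session stays in a bad state.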
> >>>> --
> >>>> Adam
> >>>>
> >>>> On Wed, Apr 15, 2015 at 9:04 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>> On Thu, Apr 16, 2015 at 9:48 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >>>>>> What is significantly smaller? We have 67 requests in the
> >>>>>> 16,400,000 range and 250 in the 18,900,000 range.
> >>>>>
> >>>>> That explains the crash. Could you help me debug this issue?
> >>>>>
> >>>>> Send /sys/kernel/debug/ceph/*/mdsc to me.
> >>>>>
> >>>>> Run "echo module ceph +p > /sys/kernel/debug/dynamic_debug/control"
> >>>>> on the cephfs mount machine.
> >>>>> Restart the mds and wait until it crashes again.
> >>>>> Run "echo module ceph -p > /sys/kernel/debug/dynamic_debug/control"
> >>>>> on the cephfs mount machine.
> >>>>> Send the kernel messages of the cephfs mount machine to me (they
> >>>>> should be in /var/log/kern.log or /var/log/messages).
> >>>>>
> >>>>> To recover from the crash, you can either force-reset the machine
> >>>>> containing the cephfs mount or add "mds wipe sessions = 1" to the
> >>>>> mds section of ceph.conf.
> >>>>>
> >>>>> Regards
> >>>>> Yan, Zheng
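To spell out Yan's debugging sequence (a sketch; it assumes the client kernel was built with CONFIG_DYNAMIC_DEBUG, which is exactly what Adam is rebuilding for above):

    # on the cephfs mount machine: enable verbose ceph debug messages
    echo module ceph +p > /sys/kernel/debug/dynamic_debug/control
    # ... restart the mds and wait until it crashes again ...
    # then switch the verbose messages back off
    echo module ceph -p > /sys/kernel/debug/dynamic_debug/control
    # the output lands in the kernel log, e.g. /var/log/kern.log

And the non-reboot recovery option would be this stanza in ceph.conf on the MDS hosts:

    [mds]
    mds wipe sessions = 1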
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Adam
> >>>>>>
> >>>>>> On Wed, Apr 15, 2015 at 8:38 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>>>> On Thu, Apr 16, 2015 at 9:07 AM, Adam Tygart <mozes@xxxxxxx> wrote:
> >>>>>>>> We are using 3.18.6-gentoo. Based on that, I was hoping that the
> >>>>>>>> kernel bug referred to in the bug report would have been fixed.
> >>>>>>>
> >>>>>>> The bug was supposed to be fixed, but you have hit it again. Could
> >>>>>>> you check whether the kernel client has any hung mds requests?
> >>>>>>> Check /sys/kernel/debug/ceph/*/mdsc on the machine that contains
> >>>>>>> the cephfs mount, and look for any request whose ID is
> >>>>>>> significantly smaller than the other requests' IDs.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Yan, Zheng
> >>>>>>>
> >>>>>>>> --
> >>>>>>>> Adam
> >>>>>>>>
> >>>>>>>> On Wed, Apr 15, 2015 at 8:02 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>>>>>> On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson <kylehutson@xxxxxxx> wrote:
> >>>>>>>>>> Thank you, John!
> >>>>>>>>>>
> >>>>>>>>>> That was exactly the bug we were hitting. My Google-fu didn't
> >>>>>>>>>> lead me to this one.
> >>>>>>>>>
> >>>>>>>>> Here is the bug report: http://tracker.ceph.com/issues/10449.
> >>>>>>>>> It is a kernel client bug which causes the session map size to
> >>>>>>>>> grow indefinitely. Which version of the Linux kernel are you
> >>>>>>>>> using?
> >>>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>> Yan, Zheng
> >>>>>>>>>
> >>>>>>>>>> On Wed, Apr 15, 2015 at 4:16 PM, John Spray <john.spray@xxxxxxxxxx> wrote:
> >>>>>>>>>>> On 15/04/2015 20:02, Kyle Hutson wrote:
> >>>>>>>>>>>> I upgraded to 0.94.1 from 0.94 on Monday, and everything had
> >>>>>>>>>>>> been going pretty well.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Then, about noon today, we had an mds crash. And then the
> >>>>>>>>>>>> failover mds crashed. And this cascaded through all 4 mds
> >>>>>>>>>>>> servers we have.
> >>>>>>>>>>>>
> >>>>>>>>>>>> If I try to start it ('service ceph start mds' on CentOS
> >>>>>>>>>>>> 7.1), it appears to be OK for a little while. ceph -w goes
> >>>>>>>>>>>> through 'replay', 'reconnect', 'rejoin', 'clientreplay' and
> >>>>>>>>>>>> 'active', but nearly immediately after getting to 'active',
> >>>>>>>>>>>> it crashes again.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I have the mds log at
> >>>>>>>>>>>> http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log
> >>>>>>>>>>>>
> >>>>>>>>>>>> For the possibly, but not necessarily, useful background
> >>>>>>>>>>>> info:
> >>>>>>>>>>>> - Yesterday we took our erasure-coded pool and increased both
> >>>>>>>>>>>> pg_num and pgp_num from 2048 to 4096. We still have several
> >>>>>>>>>>>> objects misplaced (~17%), but those seem to be continuing to
> >>>>>>>>>>>> clean themselves up.
> >>>>>>>>>>>> - We are in the midst of a large (300+ TB) rsync from our old
> >>>>>>>>>>>> (non-ceph) filesystem to this filesystem.
> >>>>>>>>>>>> - Before we realized the mds crashes, we had just changed the
> >>>>>>>>>>>> size of our metadata pool from 2 to 4.
> >>>>>>>>>>>
> >>>>>>>>>>> It looks like you're seeing
> >>>>>>>>>>> http://tracker.ceph.com/issues/10449, which is a situation
> >>>>>>>>>>> where the SessionMap object becomes too big for the MDS to
> >>>>>>>>>>> save. The cause of it in that case was stuck requests from a
> >>>>>>>>>>> misbehaving client running a slightly older kernel.
> >>>>>>>>>>>
> >>>>>>>>>>> Assuming you're using the kernel client and having a similar
> >>>>>>>>>>> problem, you could try to work around this situation by
> >>>>>>>>>>> forcibly unmounting the clients while the MDS is offline, such
> >>>>>>>>>>> that during clientreplay the MDS will remove them from the
> >>>>>>>>>>> SessionMap after timing out, and then the next time it tries
> >>>>>>>>>>> to save the map it won't be oversized. If that works, you
> >>>>>>>>>>> could then look into getting newer kernels on the clients to
> >>>>>>>>>>> avoid hitting the issue again -- the #10449 ticket has some
> >>>>>>>>>>> pointers about which kernel changes were relevant.
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers,
> >>>>>>>>>>> John
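For the record, John's forced-unmount workaround would presumably look like this on each client while the MDS is down (/mnt/cephfs is a placeholder for the real mount point):

    # force the unmount even though the MDS is unreachable
    umount -f /mnt/cephfs
    # if that hangs, a lazy unmount detaches the mount point immediately
    umount -l /mnt/cephfs

Failing that, newer client kernels, as the #10449 ticket suggests, would be the longer-term fix.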
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com