(Adding back to the list) We've not seen any slow requests anywhere near
that far behind. Leading up to the crash, the furthest behind I saw any
request was ~90 seconds. Here is the cluster log leading up to the mds
crashes:

http://people.beocat.cis.ksu.edu/~mozes/ceph-mds-crashes-20150415.log

--
Adam

On Thu, Apr 16, 2015 at 1:35 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> On Thu, Apr 16, 2015 at 10:44 AM, Adam Tygart <mozes@xxxxxxx> wrote:
>> We did that just after Kyle responded to John Spray above. I am
>> rebuilding the kernel now to include dynamic printk support.
>>
>
> Maybe the first crash was caused by a hung request in the MDS. Are there
> warnings like "cluster [WRN] slow request [several thousand or more]
> seconds old, received at ...: client_request(client.734537:23 getattr
> pAsLsXsFs ...)" in your ceph cluster log?
>
> Regards
> Yan, Zheng
>
>> --
>> Adam
>>
>> On Wed, Apr 15, 2015 at 9:37 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>> On Thu, Apr 16, 2015 at 10:24 AM, Adam Tygart <mozes@xxxxxxx> wrote:
>>>> I don't have "dynamic_debug" enabled in the currently running kernel,
>>>> so I can't bump the verbosity of the ceph functions. I can rebuild the
>>>> kernel and reboot to enable dynamic_debug, but then we'll have to
>>>> wait until we re-trigger the bug. Attached is the mdsc file.
>>>>
>>>> As for getting the mds back running, we put a route in the faulty
>>>> client to redirect ceph traffic to the loopback device, started the
>>>> mds again, waited for the full startup sequence to finish for the mds,
>>>> and re-set the normal routing. That seemed to clean up the existing
>>>> session and allow the mds to live and the client to reconnect. With
>>>> the above mds requests still pending/hung, of course.
>>>
>>> Did you do this trick before? The trick leaves the client in an ill
>>> state. The MDS will crash again after the client sends another 3M
>>> requests to it.
>>>
>>> Regards
>>> Yan, Zheng
>>>
>>>>
>>>> --
>>>> Adam
>>>>
>>>> On Wed, Apr 15, 2015 at 9:04 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>>> On Thu, Apr 16, 2015 at 9:48 AM, Adam Tygart <mozes@xxxxxxx> wrote:
>>>>>> What is significantly smaller? We have 67 requests in the 16,400,000
>>>>>> range and 250 in the 18,900,000 range.
>>>>>>
>>>>>
>>>>> That explains the crash. Could you help me debug this issue?
>>>>>
>>>>> Send /sys/kernel/debug/ceph/*/mdsc to me.
>>>>>
>>>>> Run "echo module ceph +p > /sys/kernel/debug/dynamic_debug/control"
>>>>> on the cephfs mount machine.
>>>>> Restart the mds and wait until it crashes again.
>>>>> Run "echo module ceph -p > /sys/kernel/debug/dynamic_debug/control"
>>>>> on the cephfs mount machine.
>>>>> Send the kernel messages of the cephfs mount machine to me (they
>>>>> should be in /var/log/kern.log or /var/log/messages).
>>>>>
>>>>> To recover from the crash, you can either force-reset the machine
>>>>> containing the cephfs mount or add "mds wipe sessions = 1" to the mds
>>>>> section of ceph.conf.
>>>>>
>>>>> Regards
>>>>> Yan, Zheng
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Adam
>>>>>>
>>>>>> On Wed, Apr 15, 2015 at 8:38 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>>>>> On Thu, Apr 16, 2015 at 9:07 AM, Adam Tygart <mozes@xxxxxxx> wrote:
>>>>>>>> We are using 3.18.6-gentoo. Based on that, I was hoping that the
>>>>>>>> kernel bug referred to in the bug report would have been fixed.
>>>>>>>>
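For reference, the client-side check and debug capture that Yan describes in
this thread amount to roughly the following on the cephfs mount machine. This
is only a sketch: the assumption that the request ID is the first column of
the mdsc file, the output file name, and the exact kernel log location are
illustrative, not taken from the thread.

    # List in-flight MDS requests on the CephFS client. A stuck request
    # shows up as an entry whose ID (assumed here to be the first column)
    # is far below all the others.
    cat /sys/kernel/debug/ceph/*/mdsc | sort -n | head

    # Enable verbose logging for the ceph kernel module (requires a
    # kernel built with dynamic_debug support).
    echo module ceph +p > /sys/kernel/debug/dynamic_debug/control

    # ...restart the MDS and wait for it to crash again...

    # Disable the verbose logging again.
    echo module ceph -p > /sys/kernel/debug/dynamic_debug/control

    # Collect the kernel messages; the file location varies by distribution
    # (e.g. /var/log/kern.log or /var/log/messages), or use dmesg.
    dmesg > ceph-client-debug.log

Judging what counts as "significantly smaller" still takes the kind of
comparison Adam makes above (67 requests around 16,400,000 versus 250 around
18,900,000).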
>>>>>>>
>>>>>>> The bug was supposed to be fixed, but you hit the bug again. Could you
>>>>>>> check whether the kernel client has any hung mds requests? (Check
>>>>>>> /sys/kernel/debug/ceph/*/mdsc on the machine that contains the cephfs
>>>>>>> mount, and see whether there is any request whose ID is significantly
>>>>>>> smaller than the other requests' IDs.)
>>>>>>>
>>>>>>> Regards
>>>>>>> Yan, Zheng
>>>>>>>
>>>>>>>> --
>>>>>>>> Adam
>>>>>>>>
>>>>>>>> On Wed, Apr 15, 2015 at 8:02 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>>>>>>> On Thu, Apr 16, 2015 at 5:29 AM, Kyle Hutson <kylehutson@xxxxxxx> wrote:
>>>>>>>>>> Thank you, John!
>>>>>>>>>>
>>>>>>>>>> That was exactly the bug we were hitting. My Google-fu didn't lead me to
>>>>>>>>>> this one.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here is the bug report: http://tracker.ceph.com/issues/10449. It's a
>>>>>>>>> kernel client bug which causes the session map size to increase without
>>>>>>>>> bound. Which version of the Linux kernel are you using?
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Yan, Zheng
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 15, 2015 at 4:16 PM, John Spray <john.spray@xxxxxxxxxx> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 15/04/2015 20:02, Kyle Hutson wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I upgraded to 0.94.1 from 0.94 on Monday, and everything had been going
>>>>>>>>>>>> pretty well.
>>>>>>>>>>>>
>>>>>>>>>>>> Then, about noon today, we had an mds crash. And then the failover mds
>>>>>>>>>>>> crashed. And this cascaded through all 4 mds servers we have.
>>>>>>>>>>>>
>>>>>>>>>>>> If I try to start it ('service ceph start mds' on CentOS 7.1), it appears
>>>>>>>>>>>> to be OK for a little while. ceph -w goes through 'replay', 'reconnect',
>>>>>>>>>>>> 'rejoin', 'clientreplay' and 'active', but nearly immediately after
>>>>>>>>>>>> getting to 'active', it crashes again.
>>>>>>>>>>>>
>>>>>>>>>>>> I have the mds log at
>>>>>>>>>>>> http://people.beocat.cis.ksu.edu/~kylehutson/ceph-mds.hobbit01.log
>>>>>>>>>>>>
>>>>>>>>>>>> Possibly, but not necessarily, useful background info:
>>>>>>>>>>>> - Yesterday we took our erasure-coded pool and increased both pg_num and
>>>>>>>>>>>> pgp_num from 2048 to 4096. We still have several objects misplaced (~17%),
>>>>>>>>>>>> but those seem to be continuing to clean themselves up.
>>>>>>>>>>>> - We are in the midst of a large (300+ TB) rsync from our old (non-ceph)
>>>>>>>>>>>> filesystem to this filesystem.
>>>>>>>>>>>> - Before we realized the mds crashes, we had just changed the size of our
>>>>>>>>>>>> metadata pool from 2 to 4.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It looks like you're seeing http://tracker.ceph.com/issues/10449, which is
>>>>>>>>>>> a situation where the SessionMap object becomes too big for the MDS to
>>>>>>>>>>> save. The cause of it in that case was stuck requests from a misbehaving
>>>>>>>>>>> client running a slightly older kernel.
>>>>>>>>>>>
>>>>>>>>>>> Assuming you're using the kernel client and having a similar problem, you
>>>>>>>>>>> could try to work around this situation by forcibly unmounting the clients
>>>>>>>>>>> while the MDS is offline, such that during clientreplay the MDS will remove
>>>>>>>>>>> them from the SessionMap after timing out, and then the next time it tries
>>>>>>>>>>> to save the map it won't be oversized. If that works, you could then look
>>>>>>>>>>> into getting newer kernels on the clients to avoid hitting the issue again
>>>>>>>>>>> -- the #10449 ticket has some pointers about which kernel changes were
>>>>>>>>>>> relevant.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> John
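Putting the two recovery paths discussed above side by side, they look
roughly like this. The mount point and the lazy-unmount fallback are
illustrative assumptions; the "mds wipe sessions" setting and the service
command are the ones quoted in the thread.

    # Option 1 (John): with the MDS offline, force-unmount CephFS on the
    # misbehaving client so its session is dropped during clientreplay.
    # /mnt/cephfs is a placeholder mount point.
    umount -f /mnt/cephfs
    umount -l /mnt/cephfs   # lazy unmount, in case the forced unmount hangs

    # Option 2 (Yan): have the MDS wipe its session table at startup by
    # adding the following to ceph.conf on the MDS node, then starting it
    # (and presumably removing the setting again once the MDS is healthy):
    #
    #   [mds]
    #       mds wipe sessions = 1
    #
    service ceph start mds   # as on the CentOS 7.1 MDS nodes in this thread

Either way, these steps only clear the oversized SessionMap; the longer-term
fix is the client kernel change tracked in the #10449 ticket.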
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com