On Tue, Aug 16, 2016 at 6:29 AM, Randy Orr <randy.orr@xxxxxxxxxx> wrote:
> Hi Patrick,
>
> We continue to hit this bug. Just a couple of questions:
>
> 1. I see that http://tracker.ceph.com/issues/16983 has been updated and
> you believe it is related to http://tracker.ceph.com/issues/16013. It
> looks like this fix is scheduled to be backported to Jewel at some
> point... is there any sense of when that might happen and when a point
> release will be made?
>
> 2. Looking at the pull request (https://github.com/ceph/ceph/pull/8778),
> I ran through the testing steps that were posted but was unable to
> replicate the crash.
>
> 3. When we do hit this condition, what is the best way to recover? I can
> continue to restart the MDS services and reboot the hosts, but the
> condition remains for some period of time. Even after blacklisting all
> clients, the condition persists. It's actually unclear to me how/why this
> recovers at all. If it will be some time before the fix is released, is
> there any workaround or temporary solution?

The easiest solution is to replace the ceph-mds daemon with a patched
version. Using the fuse client also avoids this issue (only the kernel
client can trigger this bug).

Regards
Yan, Zheng
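
For anyone needing the fuse workaround before the backport lands, switching
a mount from the kernel client to ceph-fuse amounts to roughly the
following. This is a minimal sketch: the mount point and monitor address
are placeholders, not values from this thread, and it assumes the client
keyring sits in the default /etc/ceph location.

    # Unmount the kernel-client mount (quiesce whatever is using it
    # first); the path here is an example:
    umount /mnt/cephfs

    # Remount the same tree with ceph-fuse; -m takes a monitor address,
    # and credentials are read from /etc/ceph by default:
    ceph-fuse -m mon1.example.com:6789 /mnt/cephfs

Because ceph-fuse is a userspace client, it avoids the kernel-client
behavior that trips this assert, per the note above.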
> Thanks in advance,
> Randy
>
> On Wed, Aug 10, 2016 at 4:38 PM, Randy Orr <randy.orr@xxxxxxxxxx> wrote:
>>
>> Patrick,
>>
>> We are using the kernel client. We have a mix of 4.4 and 3.19 kernels on
>> the client side, with plans to move away from the 3.19 kernel where/when
>> we can.
>>
>> -Randy
>>
>> On Wed, Aug 10, 2016 at 4:24 PM, Patrick Donnelly <pdonnell@xxxxxxxxxx>
>> wrote:
>>>
>>> Randy, are you using ceph-fuse or the kernel client (or something
>>> else)?
>>>
>>> On Wed, Aug 10, 2016 at 2:33 PM, Randy Orr <randy.orr@xxxxxxxxxx>
>>> wrote:
>>> > Great, thank you. Please let me know if I can be of any assistance
>>> > in testing or validating a fix.
>>> >
>>> > -Randy
>>> >
>>> > On Wed, Aug 10, 2016 at 1:21 PM, Patrick Donnelly
>>> > <pdonnell@xxxxxxxxxx> wrote:
>>> >>
>>> >> Hello Randy,
>>> >>
>>> >> On Wed, Aug 10, 2016 at 12:20 PM, Randy Orr <randy.orr@xxxxxxxxxx>
>>> >> wrote:
>>> >> > mds/Locker.cc: In function 'bool
>>> >> > Locker::check_inode_max_size(CInode*, bool, bool, uint64_t, bool,
>>> >> > uint64_t, utime_t)' thread 7fc305b83700 time
>>> >> > 2016-08-09 18:51:50.626630
>>> >> > mds/Locker.cc: 2190: FAILED assert(in->is_file())
>>> >> >
>>> >> > ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
>>> >> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>> >> > const*)+0x8b) [0x563d1e0a2d3b]
>>> >> > 2: (Locker::check_inode_max_size(CInode*, bool, bool, unsigned
>>> >> > long, bool, unsigned long, utime_t)+0x15e3) [0x563d1de506a3]
>>> >> > 3: (Server::handle_client_open(std::shared_ptr<MDRequestImpl>&)
>>> >> > +0x1061) [0x563d1dd386a1]
>>> >> > 4: (Server::dispatch_client_request(std::shared_ptr<MDRequestImpl>&)
>>> >> > +0xa0b) [0x563d1dd5709b]
>>> >> > 5: (Server::handle_client_request(MClientRequest*)+0x47f)
>>> >> > [0x563d1dd5768f]
>>> >> > 6: (Server::dispatch(Message*)+0x3bb) [0x563d1dd5b8db]
>>> >> > 7: (MDSRank::handle_deferrable_message(Message*)+0x80c)
>>> >> > [0x563d1dce1f8c]
>>> >> > 8: (MDSRank::_dispatch(Message*, bool)+0x1e1) [0x563d1dceb081]
>>> >> > 9: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x563d1dcec1d5]
>>> >> > 10: (MDSDaemon::ms_dispatch(Message*)+0xc3) [0x563d1dcd3f83]
>>> >> > 11: (DispatchQueue::entry()+0x78b) [0x563d1e1996cb]
>>> >> > 12: (DispatchQueue::DispatchThread::entry()+0xd) [0x563d1e08862d]
>>> >> > 13: (()+0x8184) [0x7fc30bd7c184]
>>> >> > 14: (clone()+0x6d) [0x7fc30a2d337d]
>>> >> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> >> > needed to interpret this.
>>> >>
>>> >> I have a bug report filed for this issue:
>>> >> http://tracker.ceph.com/issues/16983
>>> >>
>>> >> I believe it should be straightforward to solve and we'll have a fix
>>> >> for it soon.
>>> >>
>>> >> Thanks for the report!
>>> >>
>>> >> --
>>> >> Patrick Donnelly
>>>
>>> --
>>> Patrick Donnelly
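
A footnote on the NOTE line in the backtrace above: the raw addresses in
such a trace can only be mapped back to source with the exact ceph-mds
binary (and its debug symbols) that produced it. A rough sketch of that
triage, assuming the binary lives at /usr/bin/ceph-mds and the matching
debug package for 10.2.1 is installed (both assumptions, not details from
this thread):

    # Disassemble with interleaved source lines (needs debug symbols):
    objdump -rdS /usr/bin/ceph-mds > ceph-mds.asm

    # Locate the failing function from frame 2 of the trace:
    grep -n 'check_inode_max_size' ceph-mds.asm | head

In this instance the trace is already symbolized: frame 2 together with
the assert line (mds/Locker.cc: 2190: FAILED assert(in->is_file())) shows
Locker::check_inode_max_size() being reached for a non-file inode during
handle_client_open(), which is what http://tracker.ceph.com/issues/16983
tracks.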