Hi,
I’m hoping someone can unblock me on this. We recently ran into a very odd issue, which I described in an earlier email to the list:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033579.html
After trying unsuccessfully to repair it, we decided to abandon the filesystem.
I marked the cluster down, failed the MDSs, and removed the FS along with its metadata and data pools.
I then created a new filesystem from scratch.
However, I am still seeing the MDS segfault when a client tries to connect. This is quite urgent for me, as we don’t have a functioning filesystem. If someone can advise how to remove any and all state, please do so; I just want to start fresh. I am very puzzled that a brand-new FS doesn’t work.
Here is the MDS log at level 20. One odd thing I notice is that the client id starts showing as ? well before the segfault. In any case, I’m just asking what needs to be done to remove all state from the MDS nodes:
2019-03-08 19:30:12.024535 7f25ec184700 20 mds.0.server get_session have 0x5477e00 client.2160819875 <client_ip>:0/945029522 state open
2019-03-08 19:30:12.024537 7f25ec184700 15 mds.0.server oldest_client_tid=1
2019-03-08 19:30:12.024564 7f25ec184700 7 mds.0.cache request_start request(client.?:1 cr=0x54a8680)
2019-03-08 19:30:12.024566 7f25ec184700 7 mds.0.server dispatch_client_request client_request(client.?:1 getattr pAsLsXsFs #1 2019-03-08 19:29:15.425510 RETRY=2) v2
2019-03-08 19:30:12.024576 7f25ec184700 10 mds.0.server rdlock_path_pin_ref request(client.?:1 cr=0x54a8680) #1
2019-03-08 19:30:12.024577 7f25ec184700 7 mds.0.cache traverse: opening base ino 1 snap head
2019-03-08 19:30:12.024579 7f25ec184700 10 mds.0.cache path_traverse finish on snapid head
2019-03-08 19:30:12.024580 7f25ec184700 10 mds.0.server ref is [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024589 7f25ec184700 10 mds.0.locker acquire_locks request(client.?:1 cr=0x54a8680)
2019-03-08 19:30:12.024591 7f25ec184700 20 mds.0.locker must rdlock (iauth sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024594 7f25ec184700 20 mds.0.locker must rdlock (ilink sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024597 7f25ec184700 20 mds.0.locker must rdlock (ifile sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024600 7f25ec184700 20 mds.0.locker must rdlock (ixattr sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024602 7f25ec184700 20 mds.0.locker must rdlock (isnap sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024605 7f25ec184700 10 mds.0.locker must authpin [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024607 7f25ec184700 10 mds.0.locker auth_pinning [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 0x53ca968]
2019-03-08 19:30:12.024610 7f25ec184700 10 mds.0.cache.ino(1) auth_pin by 0x51e5e00 on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 authpin=1 0x53ca968] now 1+0
2019-03-08 19:30:12.024614 7f25ec184700 7 mds.0.locker rdlock_start on (isnap sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | request=1 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024618 7f25ec184700 10 mds.0.locker got rdlock on (isnap sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (isnap sync r=1) (iversion lock) | request=1 lock=1 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024621 7f25ec184700 7 mds.0.locker rdlock_start on (ifile sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (isnap sync r=1) (iversion lock) | request=1 lock=1 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024625 7f25ec184700 10 mds.0.locker got rdlock on (ifile sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=2 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024628 7f25ec184700 7 mds.0.locker rdlock_start on (iauth sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=2 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024631 7f25ec184700 10 mds.0.locker got rdlock on (iauth sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (iauth sync r=1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=3 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024635 7f25ec184700 7 mds.0.locker rdlock_start on (ilink sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (iauth sync r=1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=3 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024638 7f25ec184700 10 mds.0.locker got rdlock on (ilink sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (iauth sync r=1) (ilink sync r=1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=4 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024642 7f25ec184700 7 mds.0.locker rdlock_start on (ixattr sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (iauth sync r=1) (ilink sync r=1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=4 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024646 7f25ec184700 10 mds.0.locker got rdlock on (ixattr sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 1=0+1) (iauth sync r=1) (ilink sync r=1) (isnap sync r=1) (ifile sync r=1) (ixattr sync r=1) (iversion lock) | request=1 lock=5 dirfrag=1 authpin=1 0x53ca968]
2019-03-08 19:30:12.024658 7f25ec184700 10 mds.0.server reply to stat on client_request(client.?:1 getattr pAsLsXsFs #1 2019-03-08 19:29:15.425510 RETRY=2) v2
2019-03-08 19:30:12.024661 7f25ec184700 10 mds.0.server reply_client_request 0 ((0) Success) client_request(client.?:1 getattr pAsLsXsFs #1 2019-03-08 19:29:15.425510 RETRY=2) v2
2019-03-08 19:30:12.024673 7f25ec184700 10 mds.0.server apply_allocated_inos 0 / [] / 0
2019-03-08 19:30:12.024674 7f25ec184700 20 mds.0.server lat 0.060895
2019-03-08 19:30:12.024677 7f25ec184700 20 mds.0.server set_trace_dist snapid head
2019-03-08 19:30:12.024679 7f25ec184700 10 mds.0.server set_trace_dist snaprealm snaprealm(1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x53b8480) len=48
2019-03-08 19:30:12.024683 7f25ec184700 20 mds.0.cache.ino(1) pfile 0 pauth 0 plink 0 pxattr 0 plocal 0 ctime 2019-03-07 21:12:21.476328 valid=1
2019-03-08 19:30:12.024688 7f25ec184700 10 mds.0.cache.ino(1) add_client_cap first cap, joining realm snaprealm(1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x53b8480)
2019-03-08 19:30:12.026741 7f25ec184700 -1 *** Caught signal (Segmentation fault) **
in thread 7f25ec184700
ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
1: ceph_mds() [0x89982a]
2: (()+0x10350) [0x7f25f4647350]
3: (CInode::get_caps_allowed_for_client(client_t) const+0x130) [0x7a19f0]
4: (CInode::encode_inodestat(ceph::buffer::list&, Session*, SnapRealm*, snapid_t, unsigned int, int)+0x132d) [0x7b383d]
5: (Server::set_trace_dist(Session*, MClientReply*, CInode*, CDentry*, snapid_t, int, std::tr1::shared_ptr<MDRequestImpl>&)+0x471) [0x5f26e1]
6: (Server::reply_client_request(std::tr1::shared_ptr<MDRequestImpl>&, MClientReply*)+0x846) [0x611056]
7: (Server::respond_to_request(std::tr1::shared_ptr<MDRequestImpl>&, int)+0x4d9) [0x611759]
8: (Server::handle_client_getattr(std::tr1::shared_ptr<MDRequestImpl>&, bool)+0x47b) [0x613eab]
9: (Server::dispatch_client_request(std::tr1::shared_ptr<MDRequestImpl>&)+0xa38) [0x633da8]
10: (Server::handle_client_request(MClientRequest*)+0x3df) [0x63435f]
11: (Server::dispatch(Message*)+0x3f3) [0x63b8b3]
12: (MDS::handle_deferrable_message(Message*)+0x847) [0x5b6c27]
13: (MDS::_dispatch(Message*)+0x6d) [0x5d2bed]
14: (C_MDS_RetryMessage::finish(int)+0x1b) [0x63d24b]
15: (MDSInternalContextBase::complete(int)+0x163) [0x7e3363]
16: (MDS::_advance_queues()+0x48d) [0x5c9e4d]
17: (MDS::ProgressThread::entry()+0x4a) [0x5ca1aa]
18: (()+0x8192) [0x7f25f463f192]
19: (clone()+0x6d) [0x7f25f3b4c26d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
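To pin down where the client id turns into ? in a longer log, something like the following might help. It is only a rough sketch: the function name and the regular expression are my own assumptions, based on the line format in the excerpt above, and are not part of any Ceph tooling.

```python
import re

# Assumed line shape (taken from the excerpt above):
#   "<date> <time> <thread> <level> <subsys> ... request(client.<id>:<tid> ..."
# The id is either a number or a literal "?".
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r".*?request\(client\.(?P<cid>\?|\d+):(?P<tid>\d+)"
)


def find_unknown_client_requests(lines):
    """Return (timestamp, tid) for every line whose request names client.?"""
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("cid") == "?":
            hits.append((m.group("ts"), int(m.group("tid"))))
    return hits


# Three lines copied from the log excerpt above.
sample = [
    "2019-03-08 19:30:12.024535 7f25ec184700 20 mds.0.server get_session "
    "have 0x5477e00 client.2160819875 <client_ip>:0/945029522 state open",
    "2019-03-08 19:30:12.024564 7f25ec184700 7 mds.0.cache request_start "
    "request(client.?:1 cr=0x54a8680)",
    "2019-03-08 19:30:12.024566 7f25ec184700 7 mds.0.server "
    "dispatch_client_request client_request(client.?:1 getattr pAsLsXsFs #1 "
    "2019-03-08 19:29:15.425510 RETRY=2) v2",
]

hits = find_unknown_client_requests(sample)
for ts, tid in hits:
    print(ts, "request tid", tid, "has client.?")
```

Running it over the full log file (for example by passing an open file object instead of `sample`) should show the earliest timestamp at which a request is logged with client.? rather than a numeric id.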
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com