mds servers in endless segfault loop


 



Hello, ceph-users.


Our MDS servers keep segfaulting on a failed assertion, and for the first time I can't find anyone else who has posted about this problem. None of the MDS daemons can stay up, so our CephFS is down.


We recently had to truncate the MDS journal after an upgrade to Nautilus, and the recent event dump now contains many "dup inodes", "failed to open ino", and "badness: got (but i already had)" messages, in case that's relevant. I don't know which parts of the dump matter most, but here are the last ten entries:


  -10> 2019-10-11 03:30:35.258 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843c err -22/0
    -9> 2019-10-11 03:30:35.260 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843c err -22/0
    -8> 2019-10-11 03:30:35.260 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843d err -22/-22
    -7> 2019-10-11 03:30:35.260 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843e err -22/-22
    -6> 2019-10-11 03:30:35.261 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843f err -22/-22
    -5> 2019-10-11 03:30:35.261 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1845a err -22/-22
    -4> 2019-10-11 03:30:35.262 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1845e err -22/-22
    -3> 2019-10-11 03:30:35.262 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1846f err -22/-22
    -2> 2019-10-11 03:30:35.263 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a18470 err -22/-22
    -1> 2019-10-11 03:30:35.273 7fd080a69700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.4/rpm/el7/BUILD/ceph-14.2.4/src/mds/CInode.cc: In function 'CDir* CInode::get_or_open_dirfrag(MDCache*, frag_t)' thread 7fd080a69700 time 2019-10-11 03:30:35.273849
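
For anyone triaging a longer dump of this kind, a small script along the following lines can tally the "failed to open ino" entries by inode and by error pair. This is only a sketch: the regular expression is inferred from the excerpt above rather than from any documented Ceph log format, and `summarize_failed_opens` is a hypothetical helper, not part of any Ceph tooling. On Linux, errno 22 is EINVAL; what each of the two values after "err" represents is not stated in the log itself.

```python
import re
from collections import Counter

# Pattern inferred from the excerpt above (an assumption, not a Ceph spec):
# "failed to open ino <hex inode> err <code>/<code>"
FAILED_OPEN = re.compile(
    r"failed to open ino (0x[0-9a-fA-F]+) err (-?\d+)/(-?\d+)"
)


def summarize_failed_opens(lines):
    """Count 'failed to open ino' entries per inode and per error pair."""
    per_ino = Counter()
    per_err = Counter()
    for line in lines:
        match = FAILED_OPEN.search(line)
        if match is None:
            continue  # not a "failed to open ino" entry
        ino = match.group(1)
        err_pair = (int(match.group(2)), int(match.group(3)))
        per_ino[ino] += 1
        per_err[err_pair] += 1
    return per_ino, per_err


# Three entries copied from the dump above, used as sample input.
sample = [
    "-10> 2019-10-11 03:30:35.258 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843c err -22/0",
    "-9> 2019-10-11 03:30:35.260 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843c err -22/0",
    "-8> 2019-10-11 03:30:35.260 7fd080a69700  0 mds.0.cache  failed to open ino 0x10000a1843d err -22/-22",
]

per_ino, per_err = summarize_failed_opens(sample)
for ino, count in per_ino.most_common():
    print(ino, count)
for err_pair, count in per_err.most_common():
    print(err_pair, count)
```

To run it against a full log, pass an open file object instead of `sample`; the function only needs an iterable of lines.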

I'm happy to provide any other information that would help diagnose the issue; I just don't know what else would be useful.


Thanks in advance for any help!



Neale Pickett <neale@xxxxxxxx>
A-4: Advanced Research in Cyber Systems
Los Alamos National Laboratory
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
