Hi,

Almost always, one or more osds die during overlapping recoveries, e.g.
when I add a new crushmap and then remove some of the newly added osds
from the cluster a few minutes later, while the remap is still running,
or when I inject two slightly different crushmaps within a short time
(always keeping at least one of the replicas online). It seems that the
osd dies from an excessive number of operations in the queue: under a
normal test, e.g. rados, iowait does not exceed one percent, but during
recovery it may rise to ten percent (2108 w/ cache, disks split out as
R0 each).

#0  0x00007f62f193a445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f62f193db9b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f62f2236665 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f62f2234796 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f62f22347c3 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f62f22349ee in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x0000000000844e11 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) ()
#7  0x000000000073148f in FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int) ()
#8  0x000000000073484e in FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long) ()
#9  0x000000000070c680 in FileStore::_do_op(FileStore::OpSequencer*) ()
#10 0x000000000083ce01 in ThreadPool::worker() ()
#11 0x00000000006823ed in ThreadPool::WorkThread::entry() ()
#12 0x00007f62f345ee9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007f62f19f64cd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#14 0x0000000000000000 in ?? ()

ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)

On Sun, Aug 26, 2012 at 8:52 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
> During recovery, the following crash happens (similar to
> http://tracker.newdream.net/issues/2126, which was marked resolved long
> ago):
>
> http://xdel.ru/downloads/ceph-log/osd-2012-08-26.txt
>
> On Sat, Aug 25, 2012 at 12:30 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
>> On Thu, Aug 23, 2012 at 4:09 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> The tcmalloc backtrace on the OSD suggests this may be unrelated, but
>>> what's the fd limit on your monitor process? You may be approaching
>>> that limit if you've got 500 OSDs and a similar number of clients.
>>>
>>
>> Thanks! I didn't measure the number of connections because I had
>> assumed one connection per client; raising the limit did the trick.
>> The previously mentioned qemu-kvm zombie is not related to rbd itself -
>> it can be created by destroying a libvirt domain that is in the saving
>> state, or vice versa, so I'll put a workaround in place for this.
>> Right now I am facing a different problem - osds dying silently, i.e.
>> not leaving a core; I'll check the logs in the next testing phase.
>>
>>> On Wed, Aug 22, 2012 at 6:55 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
>>>> On Thu, Aug 23, 2012 at 2:33 AM, Sage Weil <sage@xxxxxxxxxxx> wrote:
>>>>> On Thu, 23 Aug 2012, Andrey Korolyov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> today during a heavy test a pair of osds and one mon died, resulting
>>>>>> in a hard lockup of some kvm processes - they became unresponsive and
>>>>>> were killed, leaving zombie processes ([kvm] <defunct>). The entire
>>>>>> cluster contains sixteen osds on eight nodes and three mons: on the
>>>>>> first and last nodes and on a vm outside the cluster.
>>>>>>
>>>>>> osd bt:
>>>>>> #0  0x00007fc37d490be3 in
>>>>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
>>>>>> unsigned long, int) () from /usr/lib/libtcmalloc.so.4
>>>>>> (gdb) bt
>>>>>> #0  0x00007fc37d490be3 in
>>>>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
>>>>>> unsigned long, int) () from /usr/lib/libtcmalloc.so.4
>>>>>> #1  0x00007fc37d490eb4 in tcmalloc::ThreadCache::Scavenge() () from
>>>>>> /usr/lib/libtcmalloc.so.4
>>>>>> #2  0x00007fc37d4a2287 in tc_delete () from /usr/lib/libtcmalloc.so.4
>>>>>> #3  0x00000000008b1224 in _M_dispose (__a=..., this=0x6266d80) at
>>>>>> /usr/include/c++/4.7/bits/basic_string.h:246
>>>>>> #4  ~basic_string (this=0x7fc3736639d0, __in_chrg=<optimized out>) at
>>>>>> /usr/include/c++/4.7/bits/basic_string.h:536
>>>>>> #5  ~basic_stringbuf (this=0x7fc373663988, __in_chrg=<optimized out>)
>>>>>> at /usr/include/c++/4.7/sstream:60
>>>>>> #6  ~basic_ostringstream (this=0x7fc373663980, __in_chrg=<optimized
>>>>>> out>, __vtt_parm=<optimized out>) at /usr/include/c++/4.7/sstream:439
>>>>>> #7  pretty_version_to_str () at common/version.cc:40
>>>>>> #8  0x0000000000791630 in ceph::BackTrace::print (this=0x7fc373663d10,
>>>>>> out=...) at common/BackTrace.cc:19
>>>>>> #9  0x000000000078f450 in handle_fatal_signal (signum=11) at
>>>>>> global/signal_handler.cc:91
>>>>>> #10 <signal handler called>
>>>>>> #11 0x00007fc37d490be3 in
>>>>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
>>>>>> unsigned long, int) () from /usr/lib/libtcmalloc.so.4
>>>>>> #12 0x00007fc37d490eb4 in tcmalloc::ThreadCache::Scavenge() () from
>>>>>> /usr/lib/libtcmalloc.so.4
>>>>>> #13 0x00007fc37d49eb97 in tc_free () from /usr/lib/libtcmalloc.so.4
>>>>>> #14 0x00007fc37d1c6670 in __gnu_cxx::__verbose_terminate_handler() ()
>>>>>> from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>>>> #15 0x00007fc37d1c4796 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>>>> #16 0x00007fc37d1c47c3 in std::terminate() () from
>>>>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>>>> #17 0x00007fc37d1c49ee in __cxa_throw () from
>>>>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>>>> #18 0x0000000000844e11 in ceph::__ceph_assert_fail (assertion=0x90c01c
>>>>>> "0 == \"unexpected error\"", file=<optimized out>, line=3007,
>>>>>> func=0x90ef80 "unsigned int
>>>>>> FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int)")
>>>>>> at common/assert.cc:77
>>>>>
>>>>> This means it got an unexpected error when talking to the file system. If
>>>>> you look in the osd log, it may tell you what that was. (It may
>>>>> not--there isn't usually the other tcmalloc stuff triggered from the
>>>>> assert handler.)
>>>>>
>>>>> What happens if you restart that ceph-osd daemon?
>>>>>
>>>>> sage
>>>>>
>>>>>
>>>>
>>>> Unfortunately I had completely disabled logging during the test, so
>>>> there is no hint about the cause of the assert_fail. The main problem
>>>> was revealed - the created VMs were pointed at one monitor instead of
>>>> the set of three, so there may be some unusual effects (btw, the
>>>> crashed mon isn't the one from above, but a neighbor of the crashed
>>>> osds on the first node). After an IPMI reset the node came back fine
>>>> and cluster behavior seems to be okay - stuck kvm I/O somehow
>>>> prevented even loading or unloading other modules on this node, so I
>>>> finally decided to do a hard reset. Although I'm using an almost
>>>> generic wheezy, glibc was updated to 2.15; maybe that is why this
>>>> trace appeared for the first time. I'm almost sure that the fs did not
>>>> trigger this crash, and I mainly suspect the stuck kvm processes. I'll
>>>> rerun the test under the same conditions tomorrow (~500 vms pointed
>>>> at one mon and very high I/O, but with osd logging).
>>>>
>>>>>> #19 0x000000000073148f in FileStore::_do_transaction
>>>>>> (this=this@entry=0x2cde000, t=..., op_seq=op_seq@entry=429545,
>>>>>> trans_num=trans_num@entry=0) at os/FileStore.cc:3007
>>>>>> #20 0x000000000073484e in FileStore::do_transactions (this=0x2cde000,
>>>>>> tls=..., op_seq=429545) at os/FileStore.cc:2436
>>>>>> #21 0x000000000070c680 in FileStore::_do_op (this=0x2cde000,
>>>>>> osr=<optimized out>) at os/FileStore.cc:2259
>>>>>> #22 0x000000000083ce01 in ThreadPool::worker (this=0x2cde828) at
>>>>>> common/WorkQueue.cc:54
>>>>>> #23 0x00000000006823ed in ThreadPool::WorkThread::entry
>>>>>> (this=<optimized out>) at ./common/WorkQueue.h:126
>>>>>> #24 0x00007fc37e3eee9a in start_thread () from
>>>>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> #25 0x00007fc37c9864cd in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>>>> #26 0x0000000000000000 in ?? ()
>>>>>>
>>>>>> mon bt was exactly the same as in http://tracker.newdream.net/issues/2762
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
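
Regarding Greg's question in the thread about the fd limit of the monitor
process: below is a minimal sketch, assuming a Linux host with /proc
mounted, of how the soft limit and the current number of open descriptors
of a running process could be compared. The function names and the SAMPLE
text are illustrative only and are not part of Ceph.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    A value is None when the kernel reports "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files" <soft> <hard> <units>
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v)
                         for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a running process (Linux only)."""
    with open("/proc/%d/limits" % pid) as f:
        soft, _hard = parse_max_open_files(f.read())
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open_fds = len(os.listdir("/proc/%d/fd" % pid))
    return open_fds, soft


# Hypothetical sample in the layout used by /proc/<pid>/limits.
SAMPLE = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            1024                 4096                 files\n"
)

if __name__ == "__main__":
    print(parse_max_open_files(SAMPLE))
```

fd_usage() has to run as root or as the user that owns the process, since
/proc/<pid>/fd is not readable otherwise. A descriptor count close to the
soft limit on the mon would be consistent with the limit Greg describes.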