On Tue, 4 Sep 2012, Andrey Korolyov wrote:
> Hi,
>
> Almost always, one or more osds die when doing overlapped recovery -
> e.g. adding a new crushmap and then removing some newly added osds from
> the cluster a few minutes later during the remap, or injecting two
> slightly different crushmaps within a short time (while of course
> keeping at least one of the replicas online). It seems the osd dies
> because of an excessive number of operations in its queue: under a
> normal test, e.g. rados, iowait does not break the one percent barrier,
> but during recovery it may rise to ten percent (2108 w/ cache, disks
> split as R0 each).
>
> #0 0x00007f62f193a445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1 0x00007f62f193db9b in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2 0x00007f62f2236665 in __gnu_cxx::__verbose_terminate_handler() ()
> from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #3 0x00007f62f2234796 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #4 0x00007f62f22347c3 in std::terminate() () from
> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #5 0x00007f62f22349ee in __cxa_throw () from
> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #6 0x0000000000844e11 in ceph::__ceph_assert_fail(char const*, char
> const*, int, char const*) ()
> #7 0x000000000073148f in
> FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long,
> int) ()

Can you install debug symbols to see what line number this is on (e.g.
apt-get install ceph-dbg), or check in the log file to see what the
assert failure is?

Thanks!
sage

> #8 0x000000000073484e in
> FileStore::do_transactions(std::list<ObjectStore::Transaction*,
> std::allocator<ObjectStore::Transaction*> >&, unsigned long) ()
> #9 0x000000000070c680 in FileStore::_do_op(FileStore::OpSequencer*) ()
> #10 0x000000000083ce01 in ThreadPool::worker() ()
> #11 0x00000000006823ed in ThreadPool::WorkThread::entry() ()
> #12 0x00007f62f345ee9a in start_thread () from
> /lib/x86_64-linux-gnu/libpthread.so.0
> #13 0x00007f62f19f64cd in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #14 0x0000000000000000 in ?? ()
> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
>
> On Sun, Aug 26, 2012 at 8:52 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
> > During recovery, the following crash happens (similar to
> > http://tracker.newdream.net/issues/2126, which was marked resolved
> > long ago):
> >
> > http://xdel.ru/downloads/ceph-log/osd-2012-08-26.txt
> >
> > On Sat, Aug 25, 2012 at 12:30 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
> >> On Thu, Aug 23, 2012 at 4:09 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> >>> The tcmalloc backtrace on the OSD suggests this may be unrelated, but
> >>> what's the fd limit on your monitor process? You may be approaching
> >>> that limit if you've got 500 OSDs and a similar number of clients.
> >>>
> >>
> >> Thanks! I didn't measure the number of connections because I had
> >> assumed one connection per client; raising the limit did the trick.
> >> The previously mentioned qemu-kvm zombie is not related to rbd
> >> itself - it can be created by destroying a libvirt domain which is in
> >> the saving state, or vice versa, so I'll put a workaround in place for
> >> this. Right now I am facing a different problem - osds dying silently,
> >> i.e. not leaving a core; I'll check the logs in the next testing
> >> phase.
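For reference, a minimal sketch of checking how close the monitor is to
its fd limit; the pid lookup and paths are assumptions for a typical
Linux node running a single ceph-mon:

    # compare the limit in effect with the descriptors actually in use
    MON_PID=$(pidof ceph-mon)
    grep 'Max open files' /proc/$MON_PID/limits
    ls /proc/$MON_PID/fd | wc -l

Raising the soft limit before the daemon starts (ulimit -n in the init
environment, or the 'max open files' option in ceph.conf, assuming this
version honors it) keeps ~500 OSD sessions plus client sessions from
exhausting it.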
> >>
> >>> On Wed, Aug 22, 2012 at 6:55 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
> >>>> On Thu, Aug 23, 2012 at 2:33 AM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> >>>>> On Thu, 23 Aug 2012, Andrey Korolyov wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Today during a heavy test a pair of osds and one mon died, resulting
> >>>>>> in a hard lockup of some kvm processes - they went unresponsive and
> >>>>>> were killed, leaving zombie processes ([kvm] <defunct>). The entire
> >>>>>> cluster contains sixteen osds on eight nodes and three mons, on the
> >>>>>> first and last node and on a vm outside the cluster.
> >>>>>>
> >>>>>> osd bt:
> >>>>>> #0 0x00007fc37d490be3 in
> >>>>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
> >>>>>> unsigned long, int) () from /usr/lib/libtcmalloc.so.4
> >>>>>> (gdb) bt
> >>>>>> #0 0x00007fc37d490be3 in
> >>>>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
> >>>>>> unsigned long, int) () from /usr/lib/libtcmalloc.so.4
> >>>>>> #1 0x00007fc37d490eb4 in tcmalloc::ThreadCache::Scavenge() () from
> >>>>>> /usr/lib/libtcmalloc.so.4
> >>>>>> #2 0x00007fc37d4a2287 in tc_delete () from /usr/lib/libtcmalloc.so.4
> >>>>>> #3 0x00000000008b1224 in _M_dispose (__a=..., this=0x6266d80) at
> >>>>>> /usr/include/c++/4.7/bits/basic_string.h:246
> >>>>>> #4 ~basic_string (this=0x7fc3736639d0, __in_chrg=<optimized out>) at
> >>>>>> /usr/include/c++/4.7/bits/basic_string.h:536
> >>>>>> #5 ~basic_stringbuf (this=0x7fc373663988, __in_chrg=<optimized out>)
> >>>>>> at /usr/include/c++/4.7/sstream:60
> >>>>>> #6 ~basic_ostringstream (this=0x7fc373663980, __in_chrg=<optimized
> >>>>>> out>, __vtt_parm=<optimized out>) at /usr/include/c++/4.7/sstream:439
> >>>>>> #7 pretty_version_to_str () at common/version.cc:40
> >>>>>> #8 0x0000000000791630 in ceph::BackTrace::print (this=0x7fc373663d10,
> >>>>>> out=...) at common/BackTrace.cc:19
> >>>>>> #9 0x000000000078f450 in handle_fatal_signal (signum=11) at
> >>>>>> global/signal_handler.cc:91
> >>>>>> #10 <signal handler called>
> >>>>>> #11 0x00007fc37d490be3 in
> >>>>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,
> >>>>>> unsigned long, int) () from /usr/lib/libtcmalloc.so.4
> >>>>>> #12 0x00007fc37d490eb4 in tcmalloc::ThreadCache::Scavenge() () from
> >>>>>> /usr/lib/libtcmalloc.so.4
> >>>>>> #13 0x00007fc37d49eb97 in tc_free () from /usr/lib/libtcmalloc.so.4
> >>>>>> #14 0x00007fc37d1c6670 in __gnu_cxx::__verbose_terminate_handler() ()
> >>>>>> from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >>>>>> #15 0x00007fc37d1c4796 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >>>>>> #16 0x00007fc37d1c47c3 in std::terminate() () from
> >>>>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >>>>>> #17 0x00007fc37d1c49ee in __cxa_throw () from
> >>>>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >>>>>> #18 0x0000000000844e11 in ceph::__ceph_assert_fail (assertion=0x90c01c
> >>>>>> "0 == \"unexpected error\"", file=<optimized out>, line=3007,
> >>>>>> func=0x90ef80 "unsigned int
> >>>>>> FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int)")
> >>>>>> at common/assert.cc:77
> >>>>>
> >>>>> This means it got an unexpected error when talking to the file system. If
> >>>>> you look in the osd log, it may tell you what that was. (It may
> >>>>> not--there isn't usually the other tcmalloc stuff triggered from the
> >>>>> assert handler.)
> >>>>>
> >>>>> What happens if you restart that ceph-osd daemon?
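A rough sketch of pulling the underlying error out of the osd log and
bouncing the daemon; the log path, osd id, and grep pattern are
assumptions, so adjust them to this cluster:

    # the failing operation and errno are usually logged just before
    # the assert fires
    grep -B 10 'unexpected error' /var/log/ceph/*osd*.log | tail -40
    # restart the crashed daemon via the sysvinit script, e.g. osd.0
    /etc/init.d/ceph restart osd.0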
> >>>>>
> >>>>> sage
> >>>>>
> >>>>>
> >>>>
> >>>> Unfortunately I had completely disabled logs during the test, so there
> >>>> is no hint of what the assert failure was. The main problem was
> >>>> revealed, though - the created VMs were pointed to one monitor instead
> >>>> of the set of three, so there may be some unusual effects (btw, the
> >>>> crashed mon isn't the one from above, but a neighbor of the crashed
> >>>> osds on the first node). After an IPMI reset the node came back fine
> >>>> and cluster behavior seems to be okay - the stuck kvm I/O somehow
> >>>> prevented even module load/unload on this node, so I finally decided
> >>>> to do a hard reset. Although I'm running almost generic wheezy, glibc
> >>>> was updated to 2.15, which may be why this trace appeared for the
> >>>> first time. I'm almost sure the fs did not trigger this crash and
> >>>> mainly suspect the stuck kvm processes. I'll rerun the test under the
> >>>> same conditions tomorrow (~500 vms pointed to one mon and very high
> >>>> I/O, but with osd logging enabled).
> >>>>
> >>>>>> #19 0x000000000073148f in FileStore::_do_transaction
> >>>>>> (this=this@entry=0x2cde000, t=..., op_seq=op_seq@entry=429545,
> >>>>>> trans_num=trans_num@entry=0) at os/FileStore.cc:3007
> >>>>>> #20 0x000000000073484e in FileStore::do_transactions (this=0x2cde000,
> >>>>>> tls=..., op_seq=429545) at os/FileStore.cc:2436
> >>>>>> #21 0x000000000070c680 in FileStore::_do_op (this=0x2cde000,
> >>>>>> osr=<optimized out>) at os/FileStore.cc:2259
> >>>>>> #22 0x000000000083ce01 in ThreadPool::worker (this=0x2cde828) at
> >>>>>> common/WorkQueue.cc:54
> >>>>>> #23 0x00000000006823ed in ThreadPool::WorkThread::entry
> >>>>>> (this=<optimized out>) at ./common/WorkQueue.h:126
> >>>>>> #24 0x00007fc37e3eee9a in start_thread () from
> >>>>>> /lib/x86_64-linux-gnu/libpthread.so.0
> >>>>>> #25 0x00007fc37c9864cd in clone () from /lib/x86_64-linux-gnu/libc.so.6
> >>>>>> #26 0x0000000000000000 in ?? ()
> >>>>>>
> >>>>>> mon bt was exactly the same as in http://tracker.newdream.net/issues/2762
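For the rerun with osd logging, a sketch of the debug settings commonly
raised when chasing FileStore asserts; the exact levels are assumptions
and will add noticeable I/O load on the osd disks:

    [osd]
        debug osd = 20
        debug filestore = 20
        debug journal = 20
        debug ms = 1

With these in ceph.conf and the log file pointed somewhere real again,
the lines just before the '0 == "unexpected error"' assert should show
which operation and error code FileStore hit.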