Hi Sage,

Sorry, I was busy with some other things, but I was able to look at this now. I can confirm that the problem is related to a missing osd, as I had to stop one of the osds to reproduce it.

When I split up the two asserts, the error occurs in:

./osd/OSDMap.h:460: FAILED assert(exists(osd))

and here is the gdb backtrace:

#0  0x0000003c0c0329b5 in raise () from /lib64/libc.so.6
#1  0x0000003c0c034195 in abort () from /lib64/libc.so.6
#2  0x0000003c110beaad in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#3  0x0000003c110bcc36 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x0000003c110bcc63 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5  0x0000003c110bcd5e in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00007ffc36df6136 in ceph::__ceph_assert_fail (assertion=0x7ffc36e22c59 "exists(osd)", file=<value optimized out>, line=460, func=<value optimized out>) at common/assert.cc:30
#7  0x00007ffc36d8014f in OSDMap::get_inst (this=<value optimized out>, osd=<value optimized out>) at osd/OSDMap.h:460
#8  0x00007ffc36d7b4c2 in Objecter::op_submit (this=0x159a4a0, op=0x7ffb29e264c0) at osdc/Objecter.cc:461
#9  0x00007ffc36d4bdc9 in Objecter::write (this=0x159a4a0, oid=..., ol=..., off=<value optimized out>, len=1638400, snapc=..., bl=..., mtime=..., onack=0x7ffc00811d20, oncommit=0x7ffc000461a0, flags=0) at osdc/Objecter.h:606
#10 0x00007ffc36d4d24b in RadosClient::aio_write (this=0x15916e0, pool=..., oid=..., off=917504, bl=..., len=1638400, c=0x7ffc03d4c010) at librados.cc:949
#11 0x00007ffc36d4d41a in rados_aio_write (pool=0x159a000, o=<value optimized out>, off=917504, buf=<value optimized out>, len=1638400, completion=0x7ffc03d4c010) at librados.cc:2119
#12 0x000000000045a305 in rbd_aio_rw_vector (bs=<value optimized out>, sector_num=<value optimized out>, qiov=<value optimized out>, nb_sectors=917504, cb=<value optimized out>, opaque=<value optimized out>, write=1) at block/rbd.c:769
#13 0x000000000045a430 in rbd_aio_writev (bs=<value optimized out>, sector_num=<value optimized out>, qiov=<value optimized out>, nb_sectors=<value optimized out>, cb=<value optimized out>, opaque=<value optimized out>) at block/rbd.c:802
#14 0x000000000043bb73 in bdrv_aio_writev (bs=0x159dd20, sector_num=2279168, qiov=0x7ffb29e26480, nb_sectors=3200, cb=<value optimized out>, opaque=<value optimized out>) at block.c:2019
#15 0x000000000043bb73 in bdrv_aio_writev (bs=0x159d3f0, sector_num=2279168, qiov=0x7ffb29e26480, nb_sectors=3200, cb=<value optimized out>, opaque=<value optimized out>) at block.c:2019
#16 0x000000000043ca2c in bdrv_aio_multiwrite (bs=0x159d3f0, reqs=0x7ffc0f5fd690, num_reqs=<value optimized out>) at block.c:2228
#17 0x000000000041cca5 in virtio_submit_multiwrite (bs=<value optimized out>, mrb=0x7ffc0f5fd690) at /usr/src/debug/qemu-kvm-0.13.0/hw/virtio-blk.c:241
#18 0x000000000041d30c in virtio_blk_handle_output (vdev=0x1e56c30, vq=<value optimized out>) at /usr/src/debug/qemu-kvm-0.13.0/hw/virtio-blk.c:359
#19 0x000000000042dc5d in kvm_handle_io (env=0x15c4b20) at /usr/src/debug/qemu-kvm-0.13.0/kvm-all.c:763
#20 kvm_run (env=0x15c4b20) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:645
#21 0x000000000042dd89 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:1238
#22 0x000000000042f181 in kvm_main_loop_cpu (_env=0x15c4b20) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:1495
#23 ap_main_loop (_env=0x15c4b20) at /usr/src/debug/qemu-kvm-0.13.0/qemu-kvm.c:1541
#24 0x0000003c0c4077e1 in start_thread () from /lib64/libpthread.so.0
#25 0x0000003c0c0e151d in clone () from /lib64/libc.so.6
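To make the failure mode concrete, here is a small self-contained C++ sketch of the assert split. FakeOSDMap is a made-up stand-in, not the real OSDMap class; the point is just that with the combined assert the abort message cannot tell you which half broke, while the split form names the failing condition ("exists(osd)" above):

#include <cassert>
#include <map>

struct FakeOSDMap {
  std::map<int, bool> up;  // osd id -> is_up (hypothetical representation)

  bool exists(int osd) const { return up.count(osd) > 0; }
  bool is_up(int osd) const { return up.at(osd); }

  void get_inst_checks(int osd) const {
    // Combined form (as on osd/OSDMap.h:460) -- one opaque message:
    //   assert(exists(osd) && is_up(osd));
    // Split form -- the assertion text pinpoints the condition:
    assert(exists(osd));  // fires for an osd that is unknown to the map
    assert(is_up(osd));   // fires for an osd that exists but is down
  }
};

int main() {
  FakeOSDMap m;
  m.up[0] = true;        // osd0 exists and is up
  m.get_inst_checks(0);  // passes
  m.get_inst_checks(1);  // aborts with "exists(osd)", matching the trace above
  return 0;
}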
I hope this helps.

Regards,
Christian

2010/11/4 Sage Weil <sage@xxxxxxxxxxxx>:
> Hi Christian,
>
> On Tue, 26 Oct 2010, Christian Brunner wrote:
>> I can't promise this for tomorrow, but I think I can do this on Thursday.
>
> Have you had a chance to look into this one at all?
>
> Thanks-
> sage
>
>>
>> Christian
>>
>> -----Original Message-----
>> From: Sage Weil [mailto:sage@xxxxxxxxxxxx]
>> Sent: Tuesday, October 26, 2010 21:09
>> To: Christian Brunner
>> Cc: ceph-devel@xxxxxxxxxxxxxxx
>> Subject: Re: ./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))
>>
>> On Tue, 26 Oct 2010, Christian Brunner wrote:
>> > When accessing multiple RBD volumes from one VM in parallel, we are
>> > receiving an assertion:
>> >
>> > ./osd/OSDMap.h: In function 'entity_inst_t OSDMap::get_inst(int)':
>> > ./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))
>>
>> Can you change that into two asserts so we can see which it is?
>>
>> And/or can you gdb this and see what the value of osd is here?
>>
>> Thanks!
>> sage
>>
>> > ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279)
>> > 1: (Objecter::op_submit(Objecter::Op*)+0x6c2) [0x38658854c2]
>> > 2: /usr/lib64/librados.so.1() [0x3865855dc9]
>> > 3: (RadosClient::aio_write(RadosClient::PoolCtx&, object_t, long,
>> >    ceph::buffer::list const&, unsigned long,
>> >    RadosClient::AioCompletion*)+0x24b) [0x386585724b]
>> > 4: (rados_aio_write()+0x9a) [0x386585741a]
>> > 5: /usr/bin/qemu-kvm() [0x45a305]
>> > 6: /usr/bin/qemu-kvm() [0x45a430]
>> > 7: /usr/bin/qemu-kvm() [0x43bb73]
>> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> > needed to interpret this.
>> >
>> > ./osd/OSDMap.h: In function 'entity_inst_t OSDMap::get_inst(int)':
>> > ./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))
>> > ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279)
>> > 1: (Objecter::op_submit(Objecter::Op*)+0x6c2) [0x38658854c2]
>> > 2: /usr/lib64/librados.so.1() [0x3865855dc9]
>> > 3: (RadosClient::aio_write(RadosClient::PoolCtx&, object_t, long,
>> >    ceph::buffer::list const&, unsigned long,
>> >    RadosClient::AioCompletion*)+0x24b) [0x386585724b]
>> > 4: (rados_aio_write()+0x9a) [0x386585741a]
>> > 5: /usr/bin/qemu-kvm() [0x45a305]
>> > 6: /usr/bin/qemu-kvm() [0x45a430]
>> > 7: /usr/bin/qemu-kvm() [0x43bb73]
>> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> > needed to interpret this.
>> >
>> > terminate called after throwing an instance of 'ceph::FailedAssertion'
>> > *** Caught signal (ABRT) ***
>> > ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279)
>> > 1: (sigabrt_handler(int)+0x91) [0x3865922b91]
>> > 2: /lib64/libc.so.6() [0x3c0c032a30]
>> > 3: (gsignal()+0x35) [0x3c0c0329b5]
>> > 4: (abort()+0x175) [0x3c0c034195]
>> > 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3c110beaad]
>> >
>> > This is reproducible by doing the following inside a VM:
>> >
>> > # mkfs.btrfs /dev/vdb /dev/vdc /dev/vdd /dev/vde
>> > # mount /dev/vdb /mnt
>> > # cd /mnt
>> > # bonnie++ -u root -d /mnt -f
>> >
>> > Any hints are welcome...
>> >
>> > Thanks,
>> >
>> > Christian
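PS: For anyone who wants to inspect this themselves, one way to capture a backtrace like the one above is to attach gdb to the running qemu-kvm process. A sketch of the session (pid, thread, and frame numbers will differ per run):

$ gdb -p $(pidof qemu-kvm)   # attach to the running VM process
(gdb) continue               # let the guest run until the assert fires
...
Program received signal SIGABRT, Aborted.
(gdb) thread apply all bt    # find the aborting thread, or plain "bt"
(gdb) frame 7                # OSDMap::get_inst in the trace above
(gdb) print osd              # shows <value optimized out> with this build;
                             # a -g -O0 build of librados would be needed
                             # to see the actual osd id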