Segmentation fault due to missing NULL check

Hi,

I have done a fresh install of Ubuntu 13.04 where I installed Ceph 0.61.4 from the binary packages on ceph.com.

I then installed Qemu 1.4.2 and tried to start a VM with an image on RBD. Qemu segfaults with this backtrace:

#0  librbd::aio_flush (ictx=0x0, c=0x7f5f37ec6580) at librbd/internal.cc:2693
#1  0x00007f5f360e67d8 in rbd_aio_flush (image=<optimized out>, c=<optimized out>) at librbd/librbd.cc:1128
#2  0x00007f5f36d3bb5f in rbd_aio_flush_wrapper (comp=<optimized out>, image=<optimized out>) at block/rbd.c:665
#3  rbd_start_aio (bs=<optimized out>, sector_num=sector_num@entry=0, qiov=qiov@entry=0x0, nb_sectors=nb_sectors@entry=0, cb=<optimized out>, opaque=<optimized out>, cmd=cmd@entry=RBD_AIO_FLUSH) at block/rbd.c:736
#4  0x00007f5f36d3bc2c in qemu_rbd_aio_flush (bs=<optimized out>, cb=<optimized out>, opaque=<optimized out>) at block/rbd.c:782
#5  0x00007f5f36d1fdbb in bdrv_co_flush (bs=0x7f5f37e9e500) at block.c:4050
#6  bdrv_co_flush (bs=0x7f5f37e9e500) at block.c:4021
#7  0x00007f5f36d1fe00 in bdrv_flush_co_entry (opaque=0x7fffd5d21610) at block.c:4018
#8  0x00007f5f36d517da in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:138
#9  0x00007f5f329f44c0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x00007fffd5d20e80 in ?? ()


Looking at the source code, I see that internal.cc does not check the ictx parameter before it dereferences it in line 2693:

      CephContext *cct = ictx->cct;

As ictx is NULL (ictx=0x0 in frame #0), the dereference segfaults.
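
To illustrate the kind of guard I mean, here is a minimal sketch. It uses stand-in types rather than the real librbd definitions; the function name checked_flush_entry and the choice of -EINVAL as the error code are my own assumptions, not what librbd does today.

```cpp
#include <cerrno>

// Hypothetical stand-ins for the librbd types; NOT the real definitions.
struct CephContext {};
struct ImageCtx {
  CephContext *cct;
};

// Hypothetical guarded entry point: reject a NULL image context up front
// instead of dereferencing it. Returns -EINVAL for NULL (an assumed error
// code) and 0 where the real function would go on to use ictx->cct.
constexpr int checked_flush_entry(const ImageCtx *ictx) {
  return ictx == nullptr ? -EINVAL : 0;
}
```

With a guard like this, the caller in frame #1 would get an error code back instead of the process crashing. It would not explain why the image pointer is NULL in the first place.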


I also don't understand what could cause Qemu to call this function with a NULL image pointer in the first place.

I have also tested this on another server running Ubuntu 12.10, and I see exactly the same problem and backtrace.

At joelio's suggestion on IRC, I tried mapping the RBD image with the kernel client. This failed with the following kernel messages:

[23504.239286] libceph: mon1 10.0.0.2:6789 feature set mismatch, my 40002 < server's 2040002, missing 2000000
[23504.239852] libceph: mon1 10.0.0.2:6789 socket error on read

What could this be caused by?

I have 0.61.4 on the client, and all osds and mons are also running 0.61.4.
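
For reference, the "missing" value in the kernel message looks like plain bit arithmetic on the two feature masks. The sketch below assumes the logged values are hexadecimal; the helper names missing_features and lowest_set_bit are my own and do not come from the kernel or Ceph sources.

```cpp
#include <cstdint>

// Bits the server advertises that the client does not have.
constexpr std::uint64_t missing_features(std::uint64_t mine,
                                         std::uint64_t server) {
  return server & ~mine;
}

// Index of the lowest set bit, or -1 if no bit is set.
// Written recursively so it stays a valid C++11 constexpr function.
constexpr int lowest_set_bit(std::uint64_t v, int bit = 0) {
  return v == 0 ? -1 : ((v & 1) ? bit : lowest_set_bit(v >> 1, bit + 1));
}
```

Applied to the log line, missing_features(0x40002, 0x2040002) gives 0x2000000, which is a single bit (bit 25). So the kernel client lacks exactly one feature that the monitor requires.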


Thanks in advance,
--
Jens Kristian Søgaard, Mermaid Consulting ApS,
jens@xxxxxxxxxxxxxxxxxxxx,
http://www.mermaidconsulting.com/