On 12/7/2019 3:56 PM, yangjun@xxxxxxxxxxxxxxxxxxxx wrote:
Hi Dongsheng,
Regarding https://tracker.ceph.com/issues/43178: can I open a PR to fix it
as you suggested? Thanks.
Sure, feel free to open a PR to fix it.
------------------------------------------------------------------------
yangjun@xxxxxxxxxxxxxxxxxxxx
*From:* Jason Dillaman <mailto:jdillama@xxxxxxxxxx>
*Date:* 2019-12-06 22:58
*To:* Dongsheng Yang <mailto:dongsheng.yang@xxxxxxxxxxxx>
*CC:* yangjun@xxxxxxxxxxxxxxxxxxxx
<mailto:yangjun@xxxxxxxxxxxxxxxxxxxx>; ceph-users
<mailto:ceph-users@xxxxxxx>
*Subject:* Re: rbd_open_by_id crash when connection timeout
On Fri, Dec 6, 2019 at 9:51 AM Dongsheng Yang
<dongsheng.yang@xxxxxxxxxxxx> wrote:
>
>
> > On 12/6/2019 9:46 PM, Jason Dillaman wrote:
> > On Fri, Dec 6, 2019 at 12:12 AM Dongsheng Yang
> > <dongsheng.yang@xxxxxxxxxxxx> wrote:
> >>
> >>
> >> On 12/06/2019 12:50 PM, yangjun@xxxxxxxxxxxxxxxxxxxx wrote:
> >>
> >> Hi Jason, dongsheng
> >> I found a problem using rbd_open_by_id when the connection times out
> >> (errno = 110, ceph version 12.2.8; rbd_open_by_id is unchanged in the
> >> master branch).
> >>   int r = ictx->state->open(false);
> >>   if (r < 0) {        // r = -110
> >>     delete ictx;      // crash, the stack is shown below
> >>   } else {
> >>     *image = (rbd_image_t)ictx;
> >>   }
> >> Stack is:
> >> (gdb) bt
> >> Program terminated with signal 11, Segmentation fault.
> >> #0  Mutex::Lock (this=this@entry=0x8, no_lockdep=no_lockdep@entry=false)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/Mutex.cc:92
> >> #1  0x0000ffff765eb814 in Locker (m=..., this=<synthetic pointer>)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/Mutex.h:115
> >> #2  PerfCountersCollection::remove (this=0x0, l=0xffff85bf0588 <main_arena+88>)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/perf_counters.cc:62
> >> #3  0x0000ffff757f26c4 in librbd::ImageCtx::perf_stop (this=0x2065d2f0)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/ImageCtx.cc:426
> >> #4  0x0000ffff757f6e48 in librbd::ImageCtx::~ImageCtx (this=0x2065d2f0, __in_chrg=<optimized out>)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/ImageCtx.cc:239
> >> #5  0x0000ffff757e8e70 in rbd_open_by_id (p=p@entry=0x2065e780,
> >>     id=id@entry=0xffff85928bb4 "20376aa706c0", image=image@entry=0xffff75b9b0b8,
> >>     snap_name=snap_name@entry=0x0)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/librbd.cc:2692
> >> #6  0x0000ffff75aed524 in __pyx_pf_3rbd_5Image___init__ (
> >>     __pyx_v_read_only=0xffff85f02bb8 <_Py_ZeroStruct>, __pyx_v_snapshot=0xffff85f15938 <_Py_NoneStruct>,
> >>     __pyx_v_by_id=0xffff85f02ba0 <_Py_TrueStruct>, __pyx_v_name=0xffff85928b90,
> >>     __pyx_v_ioctx=0xffff7f0283d0, __pyx_v_self=0xffff75b9b0a8)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/build/src/pybind/rbd/pyrex/rbd.c:12662
> >> #7  __pyx_pw_3rbd_5Image_1__init__ (__pyx_v_self=0xffff75b9b0a8, __pyx_args=<optimized out>,
> >>     __pyx_kwds=<optimized out>)
> >>     at /var/lib/jenkins/workspace/ceph_L/ceph/build/src/pybind/rbd/pyrex/rbd.c:12378
> >> #8  0x0000ffff85e02924 in type_call () from /lib64/libpython2.7.so.1.0
> >> #9  0x0000ffff85dac254 in PyObject_Call () from /lib64/libpython2.7.so.1.0
> > I suspect you will hit tons of errors and corruption and leaking RADOS
> > objects when you start injecting RADOS timeouts. It's not something
> > librbd really supports or tests against -- so you need to expect the
> > unexpected. Definitely feel free to open a PR to fix it, though -- but
> > otherwise it would be a really *low* priority to fix.
> >
> >> Meanwhile, I have some questions about rbd_open: if state->open fails
> >> (return value r < 0), is there a risk of a memory leak of ictx?
> >> extern "C" int rbd_open(rados_ioctx_t p, const char *name, rbd_image_t *image,
> >>                         const char *snap_name)
> >> {
> >>   librados::IoCtx io_ctx;
> >>   librados::IoCtx::from_rados_ioctx_t(p, io_ctx);
> >>   TracepointProvider::initialize<tracepoint_traits>(get_cct(io_ctx));
> >>   librbd::ImageCtx *ictx = new librbd::ImageCtx(name, "", snap_name, io_ctx,
> >>                                                 false);
> >>   tracepoint(librbd, open_image_enter, ictx, ictx->name.c_str(),
> >>              ictx->id.c_str(), ictx->snap_name.c_str(), ictx->read_only);
> >>
> >>   int r = ictx->state->open(false);
> >>   if (r >= 0) { // if r < 0, is there a risk of a memory leak?
> >>
> >>
> >> Good catch. I think there is, and the same applies to rbd_open_read_only.
> >> It seems we need a patch to fix them.
> > No, that's not a memory leak. The blocking version of
> > "ImageState::open" will free the "ImageCtx" memory upon failure.
>
>
> My bad, I did not look at it carefully before replying to this mail.
>
> I noticed that we delete ictx in rbd_open_by_id but not in rbd_open, so I
> thought there was a memory leak in rbd_open. But as you mentioned, I was
> wrong: it is released in state->open(). So we don't need to add a delete
> in rbd_open(); on the contrary, we need to remove the "delete ictx" to
> avoid a double-free problem, as below:
>
>
> diff --git a/src/librbd/librbd.cc b/src/librbd/librbd.cc
> index 6c303c9..b518fad 100644
> --- a/src/librbd/librbd.cc
> +++ b/src/librbd/librbd.cc
> @@ -3924,9 +3924,7 @@ extern "C" int rbd_open_by_id(rados_ioctx_t p, const char *id,
>               ictx->id.c_str(), ictx->snap_name.c_str(), ictx->read_only);
>
>    int r = ictx->state->open(0);
> -  if (r < 0) {
> -    delete ictx;
> -  } else {
> +  if (r >= 0) {
>      *image = (rbd_image_t)ictx;
>    }
>    tracepoint(librbd, open_image_exit, r);
> @@ -3999,9 +3997,7 @@ extern "C" int rbd_open_by_id_read_only(rados_ioctx_t p, const char *id,
>               ictx->id.c_str(), ictx->snap_name.c_str(), ictx->read_only);
>
>    int r = ictx->state->open(0);
> -  if (r < 0) {
> -    delete ictx;
> -  } else {
> +  if (r >= 0) {
>      *image = (rbd_image_t)ictx;
>    }
>
>    tracepoint(librbd, open_image_exit, r);
>
>
> There is a reproducer, test.c, attached.
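
The attached test.c is not included inline. For reference only, a minimal
sketch of a reproducer along those lines, with assumed values (pool "rbd",
the image id from the backtrace, and artificially short timeouts to force the
-110/ETIMEDOUT path), might look like this:

  // Hedged sketch of a timeout reproducer, not the attached test.c.
  // Pool name and timeout values are assumptions for illustration; the
  // image id is the one shown in the backtrace above.
  #include <cstdio>
  #include <rados/librados.h>
  #include <rbd/librbd.h>

  int main() {
    rados_t cluster;
    rados_ioctx_t io_ctx;
    rbd_image_t image;

    if (rados_create(&cluster, NULL) < 0)
      return 1;
    rados_conf_read_file(cluster, NULL);   // default ceph.conf search path
    // Very short timeouts so OSD/monitor ops return -110 (ETIMEDOUT).
    rados_conf_set(cluster, "rados_osd_op_timeout", "1");
    rados_conf_set(cluster, "rados_mon_op_timeout", "1");
    if (rados_connect(cluster) < 0) {
      rados_shutdown(cluster);
      return 1;
    }
    if (rados_ioctx_create(cluster, "rbd", &io_ctx) < 0) {
      rados_shutdown(cluster);
      return 1;
    }

    // With OSD ops timing out, state->open() fails with -110 and the
    // unpatched rbd_open_by_id() then double-frees ictx and crashes.
    int r = rbd_open_by_id(io_ctx, "20376aa706c0", &image, NULL);
    printf("rbd_open_by_id returned %d\n", r);
    if (r >= 0)
      rbd_close(image);

    rados_ioctx_destroy(io_ctx);
    rados_shutdown(cluster);
    return 0;
  }

Built against librbd and librados, e.g. linked with -lrbd -lrados.
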
Opened https://tracker.ceph.com/issues/43178
> Thanx
>
> Yang
>
> >
> >> Thanx
> >> Yang
> >>
> >> *image = (rbd_image_t)ictx;
> >> }
> >> tracepoint(librbd, open_image_exit, r);
> >> return r;
> >> }
> >>
> >> Can you give me some advice? Many thanks.
> >>
> >>
> >> ________________________________
> >> yangjun@xxxxxxxxxxxxxxxxxxxx
> >>
> >>
> >
Thanks,
--
Jason
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx