On Sun, May 11, 2014 at 7:11 AM, Alex Elder <elder@xxxxxxxx> wrote:
> On 05/10/2014 05:18 PM, Hannes Landeholm wrote:
>> Hello,
>>
>> I have a development machine that I have been running stress tests on
>> for a week as I'm trying to reproduce some hard-to-reproduce failures.
>> I've mentioned the same machine previously in the thread "rbd unmap
>> deadlock". I just now noticed that some processes had completely
>> stalled. I looked in the system log and saw this crash about 9 hours
>> ago:
>
> Are you still running kernel rbd as a client of ceph services running
> on the same physical machine?
>
> I personally believe that scenario may be at risk of deadlock in any
> case--we haven't taken great care to avoid it.
>
> Anyway...
>
> I can build v3.14.1 but I don't know what kernel configuration you
> are using.  Knowing that could be helpful.  I built it using a config
> I have, though, and it's *possible* you crashed on this line, in
> rbd_segment_name():
>
>	ret = snprintf(name, CEPH_MAX_OID_NAME_LEN + 1, name_format,
>			rbd_dev->header.object_prefix, segment);
>
> And if so, the only reason I can think that this failed is if
> rbd_dev->header.object_prefix were null (or an otherwise bad pointer
> value).  But at this point it's a lot of speculation.

More precisely, it crashed on

	segment = offset >> rbd_dev->header.obj_order;

while loading obj_order.  rbd_dev is ffff87ff3fbcdc00, which suggests a
use-after-free of some sort.  (This is the first rbd_dev dereference
after grabbing it from img_request at the top of rbd_img_request_fill(),
which got it from request_queue::queuedata in rbd_request_fn().)
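For reference, the path in question looks roughly like this (a trimmed
paraphrase of 3.14-era drivers/block/rbd.c, not the exact source, so
details may differ in your tree; elisions are marked with comments):

	static void rbd_request_fn(struct request_queue *q)
	{
		/* queuedata was set to the rbd_device at map time */
		struct rbd_device *rbd_dev = q->queuedata;

		/* ... builds an img_request carrying rbd_dev ... */
	}

	static int rbd_img_request_fill(struct rbd_img_request *img_request,
					enum obj_request_type type,
					void *data_desc)
	{
		/* rbd_dev grabbed here, at the top, but not yet
		 * dereferenced */
		struct rbd_device *rbd_dev = img_request->rbd_dev;

		/* ... per-segment loop:
		 *	object_name = rbd_segment_name(rbd_dev, img_offset);
		 * ... */
	}

	static const char *rbd_segment_name(struct rbd_device *rbd_dev,
					    u64 offset)
	{
		u64 segment;

		/* ... allocate name from rbd_segment_name_cache ... */

		/* first dereference of rbd_dev on this path; if rbd_dev
		 * has already been freed, this is where we fault */
		segment = offset >> rbd_dev->header.obj_order;

		/* ... snprintf() object_prefix and segment into name ... */
	}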
Thanks,

		Ilya