Re: crash in rbd_img_request_create

On 05/11/2014 04:33 AM, Ilya Dryomov wrote:
> On Sun, May 11, 2014 at 7:11 AM, Alex Elder <elder@xxxxxxxx> wrote:
>> On 05/10/2014 05:18 PM, Hannes Landeholm wrote:
>>> Hello,
>>>
>>> I have a development machine that I have been running stress tests on
>>> for a week as I'm trying to reproduce some hard to reproduce failures.
>>> I've mentioned the same machine previously in the thread "rbd unmap
>>> deadlock". I just now noticed that some processes had completely
>>> stalled. I looked in the system log and saw this crash about 9 hours
>>> ago:
>>
>> Are you still running kernel rbd as a client of ceph
>> services running on the same physical machine?
>>
>> I personally believe that scenario may be at risk of
>> deadlock in any case--we haven't taken great care to
>> avoid it in this case.
>>
>> Anyway...
>>
>> I can build v3.14.1 but I don't know what kernel configuration
>> you are using.  Knowing that could be helpful.  I built it using
>> a config I have though, and it's *possible* you crashed on
>> this line, in rbd_segment_name():
>>         ret = snprintf(name, CEPH_MAX_OID_NAME_LEN + 1, name_format,
>>                         rbd_dev->header.object_prefix, segment);
>> And if so, the only reason I can think that this failed is if
>> rbd_dev->header.object_prefix were null (or an otherwise bad
>> pointer value).  But at this point it's a lot of speculation.
> 
> More precisely, it crashed on
> 
> segment = offset >> rbd_dev->header.obj_order;

After looking more closely at this tonight I can say I concur.
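
For reference, the shift on that line maps a byte offset within the
image to an object (segment) index. A minimal userspace sketch,
assuming an obj_order of 22 (4 MiB objects) purely for illustration;
the helper name is hypothetical, and the report does not give the
actual order of the image involved:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: mirrors
 *     segment = offset >> rbd_dev->header.obj_order;
 * The caller must pass obj_order < 64 so the shift is well defined.
 */
static uint64_t segment_index(uint64_t offset, uint8_t obj_order)
{
	return offset >> obj_order;
}

/* Size in bytes of one object for a given order (2^obj_order). */
static uint64_t object_size(uint8_t obj_order)
{
	return UINT64_C(1) << obj_order;
}
```

With an order of 22, offsets 0 through 4194303 land in segment 0 and
offset 4194304 is the first byte of segment 1.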

kernel: BUG: unable to handle kernel paging request at ffff87ff3fbcdc58
RAX: ffff87ff3fbcdc00

    2483:       00 00 00 be             movzbl 0x58(%rax),%ecx
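
To spell out how those three lines fit together: the faulting address
is RAX plus the 0x58 displacement in the movzbl, so the load was
relative to the pointer in RAX (rbd_dev, per Ilya), and 0x58 is
presumably where header.obj_order sits in this build. A minimal
userspace check of the arithmetic (the helper name is hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the oops and the disassembly above. */
#define OOPS_RAX        UINT64_C(0xffff87ff3fbcdc00)
#define OOPS_FAULT_ADDR UINT64_C(0xffff87ff3fbcdc58)
#define LOAD_DISP       UINT64_C(0x58)  /* movzbl 0x58(%rax),%ecx */

/*
 * Hypothetical helper: true if a load at base + disp would touch
 * exactly the reported faulting address.
 */
static bool load_hits_fault(uint64_t base, uint64_t disp, uint64_t fault)
{
	return base + disp == fault;
}
```

load_hits_fault(OOPS_RAX, LOAD_DISP, OOPS_FAULT_ADDR) is true, which
is consistent with the crash being on the obj_order load.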

Unfortunately that's about all I can say right now.

Since the stack includes rbd_request_fn() we know it's a
request that came from the block layer--which means that
the rbd_img_request_create() call was not being done for
a parent image request.  On the other hand, if you're right
about use-after-free, it could still involve an image request
created via the parent path (for example, if a parent image
request were freed while it was still in use).

Hannes indicated layered images were involved.

More later...

					-Alex

> while loading obj_order.  rbd_dev is ffff87ff3fbcdc00, which suggests
> a use after free of some sort.  (This is the first rbd_dev deref after
> grabbing it from img_request at the top of rbd_img_request_fill(),
> which got it from request_queue::queuedata in rbd_request_fn().)
> 
> Thanks,
> 
>                 Ilya
> 
