On Sun, May 11, 2014 at 7:11 AM, Alex Elder <elder@xxxxxxxx> wrote:
> On 05/10/2014 05:18 PM, Hannes Landeholm wrote:
>> Hello,
>>
>> I have a development machine that I have been running stress tests on
>> for a week as I'm trying to reproduce some hard-to-reproduce failures.
>> I've mentioned the same machine previously in the thread "rbd unmap
>> deadlock". I just now noticed that some processes had completely
>> stalled. I looked in the system log and saw this crash about 9 hours
>> ago:
>
> Are you still running kernel rbd as a client of ceph services running
> on the same physical machine?
>
> I personally believe that scenario may be at risk of deadlock in any
> case--we haven't taken great care to avoid it.
>
> Anyway...
>
> I can build v3.14.1 but I don't know what kernel configuration you
> are using.  Knowing that could be helpful.  I built it using a config
> I have, though, and it's *possible* you crashed on this line, in
> rbd_segment_name():
>
>	ret = snprintf(name, CEPH_MAX_OID_NAME_LEN + 1, name_format,
>			rbd_dev->header.object_prefix, segment);
>
> And if so, the only reason I can think that this failed is if
> rbd_dev->header.object_prefix were null (or an otherwise bad pointer
> value).  But at this point it's a lot of speculation.

More precisely, it crashed on

	segment = offset >> rbd_dev->header.obj_order;

while loading obj_order.  rbd_dev is ffff87ff3fbcdc00, which suggests a
use-after-free of some sort.  (This is the first rbd_dev dereference
after grabbing it from img_request at the top of rbd_img_request_fill(),
which got it from request_queue::queuedata in rbd_request_fn().)
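For reference, the path in question looks roughly like this (a trimmed
paraphrase of 3.14-era drivers/block/rbd.c, not the exact source, so
details may differ in your tree; elisions are marked with comments):

	static void rbd_request_fn(struct request_queue *q)
	{
		/* queuedata was set to the rbd_device at map time */
		struct rbd_device *rbd_dev = q->queuedata;

		/* ... builds an img_request carrying rbd_dev ... */
	}

	static int rbd_img_request_fill(struct rbd_img_request *img_request,
					enum obj_request_type type,
					void *data_desc)
	{
		/* rbd_dev grabbed here, at the top, but not yet
		 * dereferenced */
		struct rbd_device *rbd_dev = img_request->rbd_dev;

		/* ... per-segment loop:
		 *	object_name = rbd_segment_name(rbd_dev, img_offset);
		 * ... */
	}

	static const char *rbd_segment_name(struct rbd_device *rbd_dev,
					    u64 offset)
	{
		u64 segment;

		/* ... allocate name from rbd_segment_name_cache ... */

		/* first dereference of rbd_dev on this path; if rbd_dev
		 * has already been freed, this is where we fault */
		segment = offset >> rbd_dev->header.obj_order;

		/* ... snprintf() object_prefix and segment into name ... */
	}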
Thanks,

		Ilya