Re: crash report: paging request errors in various krbd contexts

> The bottom line is that it appears that some
> memory used by rbd and/or libceph has become
> corrupted, or there is something (or more than
> one thing) that is being used after it's been
> freed.  Either way this sort of thing will be
> difficult to try to understand; it would be
> great if it could be reproduced independently.
>
> We're calling strnlen() (ultimately) from snprintf().  The
> format provided will be "%s.%012llx" (or similar).  The
> string provided for the %s is rbd_dev->header.object_prefix,
> which is a dynamically allocated string initialized once
> for the rbd device, which will be NUL-terminated and
> unchanging until the device gets mapped.
>
> Either the rbd device got unmapped while still
> in use, or the memory holding this rbd_dev structure
> got corrupted somehow.

Yes, with my limited knowledge of the kernel I would have guessed some
form of memory corruption as well, since it crashed in wildly different
contexts and, in the snprintf() case, right after a memory allocation.
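
For reference, my rough understanding of the name construction you
describe, as a small userspace sketch (the function and variable names
here are my own, not the actual rbd code):

#include <stdio.h>

/* Builds "<object_prefix>.<zero-padded hex object number>", as in the
 * "%s.%012llx" format mentioned above.  Names are illustrative only. */
static void format_object_name(char *buf, size_t len,
                               const char *object_prefix,
                               unsigned long long object_num)
{
        snprintf(buf, len, "%s.%012llx", object_prefix, object_num);
}

If object_prefix were freed or overwritten and no longer NUL-terminated,
the strnlen() done for the %s would walk into invalid memory, which
seems consistent with the crashes we saw.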

Is it possible to configure the kernel at build time so that it
sanity-checks memory allocations as they are handed out and/or freed? I
have implemented my own free-list-based VM in userspace, and I find it
very useful to insert a header with a magic canary value that I set
before giving out memory and check when I get memory back. This lets me
crash with the offending code in the backtrace instead of crashing in a
wildly different context.
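
For what it's worth, the canary scheme I use in my userspace allocator
is roughly the following (a minimal sketch with made-up names and a
made-up magic value, not the actual code):

#include <stdio.h>
#include <stdlib.h>

#define ALLOC_MAGIC 0xDEADBEEFCAFEF00DULL

struct alloc_header {
        unsigned long long magic;
        size_t size;
};

static void *guarded_alloc(size_t size)
{
        struct alloc_header *hdr = malloc(sizeof(*hdr) + size);

        if (!hdr)
                return NULL;
        hdr->magic = ALLOC_MAGIC;
        hdr->size = size;
        return hdr + 1;                 /* hand out memory past the header */
}

static void guarded_free(void *p)
{
        struct alloc_header *hdr = (struct alloc_header *)p - 1;

        if (hdr->magic != ALLOC_MAGIC) {
                /* Corruption or double free: crash here, with the
                 * offending caller in the backtrace. */
                fprintf(stderr, "bad canary: %llx\n", hdr->magic);
                abort();
        }
        hdr->magic = 0;                 /* helps catch double frees */
        free(hdr);
}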

> I don't know if you've supplied this before, but can
> you describe the way the rbd device(s) in question
> is configured?  How many devices, how big are they,
> and *especially*, are they using layering and if so
> what the relationships are between them.

It's something like ~100-200 mappings of 10 GB each. They use layering
and generally share the same parent, with varying distance to the
common ancestor snapshot, but it's unlikely to be more than ~20 layers
at the moment. More than 75% probably share the same common ancestor.
We don't have rbd caching enabled.

Thank you for your time,
Hannes