Hi HP.
Mine was not really a fix. It was just a hack to keep the OSD up long enough to make sure I had a full backup; then I rebuilt the cluster from scratch and restored the data. Although the hack did stop the OSD from crashing, the crash is probably a symptom of some internal problem, and it may not be "safe" to run like that in the long term.
I changed this:
    ObjectContextRef obc = get_object_context(oid, false);
    assert(obc);
    --ctx->delta_stats.num_objects;
    --ctx->delta_stats.num_objects_hit_set_archive;
    ctx->delta_stats.num_bytes -= obc->obs.oi.size;
    ctx->delta_stats.num_bytes_hit_set_archive -= obc->obs.oi.size;
to this:
    ObjectContextRef obc = 0; // get_object_context(oid, false);
    // assert(obc);
    --ctx->delta_stats.num_objects;
    --ctx->delta_stats.num_objects_hit_set_archive;
    if (obc)
    {
        ctx->delta_stats.num_bytes -= obc->obs.oi.size;
        ctx->delta_stats.num_bytes_hit_set_archive -= obc->obs.oi.size;
    }
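To make the effect of the change easier to see, here is a minimal standalone sketch of the same accounting pattern. It is only an illustration under assumptions: MockObjectContext, MockDeltaStats and account_removed_archive are made-up names that stand in for the real Ceph types, and none of this is actual Ceph code.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-ins for illustration only; NOT the real Ceph types.
struct MockObjectContext {
    std::int64_t size;  // plays the role of obc->obs.oi.size
};

struct MockDeltaStats {
    std::int64_t num_objects = 0;
    std::int64_t num_objects_hit_set_archive = 0;
    std::int64_t num_bytes = 0;
    std::int64_t num_bytes_hit_set_archive = 0;
};

// Account for one removed hit set archive object.
// The object counters always drop by one. The byte counters are adjusted
// only when an object context is available (obc is non-null).
void account_removed_archive(MockDeltaStats& stats,
                             const std::shared_ptr<MockObjectContext>& obc) {
    --stats.num_objects;
    --stats.num_objects_hit_set_archive;
    if (obc) {
        stats.num_bytes -= obc->size;
        stats.num_bytes_hit_set_archive -= obc->size;
    }
}
```

In my hack obc is always null, so this corresponds to calling account_removed_archive(stats, nullptr) every time: the object counters still go down, but the byte counters are never adjusted. That is one more reason I would not trust the stats on an OSD running this way.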
Good luck!
Blade.
On Sat, Aug 13, 2016 at 5:52 AM, Hein-Pieter van Braam <hp@xxxxxx> wrote:
> Hi Blade,
> I appear to be stuck in the same situation you were in. Do you still
> happen to have a patch to implement this workaround you described?
> Thanks,
> - HP
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com