mark_unfound_lost delete not deleting unfound objects

I am wondering if anyone has experience with the mark_unfound_lost delete
command seemingly not doing what it is supposed to, or if perhaps I have
unreasonable expectations about its function.

We have an EC pool backing an RGW data pool, and we have had a data loss
scenario. I've attempted to manually recover shards from all of the peers
listed in might_have_unfound, with some success, but after extensive
searching I believe the time has come to let go of the data that is still
missing, in the hope of getting the cluster back to a healthy state and
restoring service functionality.
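
For context, this is roughly how I've been tracking the missing objects
(the PG id is ours; the grep is just an illustrative way to eyeball the
JSON output):

```shell
# Cluster-wide summary of degraded/unfound state
ceph health detail

# Per-PG view: count and names of unfound objects
ceph pg 21.258e list_unfound

# Peers that might still hold the missing shards
ceph pg 21.258e query | grep -A 5 might_have_unfound
```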

When I run "ceph pg 21.258e mark_unfound_lost delete", the command runs for
some time; a few minutes in, the primary OSD is marked down in the cluster
even though the daemon is still running. The logs suggest this is because
it is doing some intensive iterative work and becomes unresponsive to
heartbeats from other OSDs. Given that we have tens of thousands of objects
being marked lost, it makes sense this might take a while... but in the
meantime the OSD is marked out, another OSD takes its place, and over the
next few hours the number of unfound objects for the PG climbs back to the
original amount. So far, the primary OSD has failed to come back in on
every attempt at this operation.
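
One thing I've considered, though I'm not certain it is safe or advisable
here, is pinning cluster membership while the delete runs so the OSD map
doesn't churn underneath the primary, roughly:

```shell
# Keep OSDs from being marked out/down while the delete is in flight
ceph osd set noout
ceph osd set nodown

# ... run "ceph pg 21.258e mark_unfound_lost delete" and wait ...

# Restore normal failure handling afterwards
ceph osd unset nodown
ceph osd unset noout
```

I'd welcome any opinion on whether nodown in particular would just mask
the unresponsiveness rather than help.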
My initial reaction was to restart the OSD when it dropped from the cluster
(and its PG went into the DOWN state) in an attempt to keep the RGW
functioning, but once I observed the logs of the primary iterating over
objects, I realized that may have been counterproductive. Yet even when I
leave the OSD to complete the iterative process, it does not seem to rejoin
the cluster without intervention in the form of a daemon restart.

I'm wondering if anyone has experience deleting unfound objects at this
scale: is this an asynchronous operation that eventually completes, or are
we hitting some unexpected behavior that warrants a bug report?
I am also wondering whether ceph-objectstore-tool might be employed to work
on all shards of the PG at once and then start them back up together, minus
the unfound objects. I haven't found much useful documentation of the
"fix-lost" operation, so I have hesitated to try it without fully
understanding what it does.
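
For the record, my understanding of the invocation is something like the
following, run against each OSD hosting a shard while that OSD is stopped.
The OSD id (12) and the s0 shard suffix are just my assumptions for one
shard of this EC PG; please correct me if the syntax is off:

```shell
# Stop the OSD that holds the shard before touching its store
systemctl stop ceph-osd@12

# Dry-run first to see what fix-lost would actually modify
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
    --pgid 21.258es0 --op fix-lost --dry-run
```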

Thank you to anyone who might be able to provide some information.
-- 
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.andrus@xxxxxxxxxxxxx | www.dreamhost.com
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


