GCed (as in tail objects already deleted from the data pool) objects remain in the GC queue forever

Jaka Močnik <jaka@xxxxxxxxx> · Wed, 24 Nov 2021 11:40:40 +0100

hi,

we are running an octopus cluster (upgraded from nautilus a few months
ago) with some 0.5PB of capacity. it is used exclusively as object
storage via rgw (clients use the swift API), served by 6 rgw instances.
the cluster has been running for a bit over two years.

it is subject to quite a heavy delete load (on the order of 1M deletes
per day).

until recently this was handled w/o any problems. however, some 10 days
ago our monitoring alerted us that the rgw gc queue was holding some
20k rgw objects dispersed over ~700k rados objects. while such peaks
were common before, they usually cleared very quickly. this one has not
cleared since; in fact, every day some 100-200k extra rados objects are
added to the gc queue.
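
for reference, a minimal python sketch of how the two numbers above
(gc entries vs. rados objects) could be counted from a dump of the
queue. it assumes the dump comes from something like
`radosgw-admin gc list --include-all` and is a JSON array in which each
entry has a `tag` and an `objs` list of rados objects; that layout is
an assumption and should be checked against real output. the sample
data is made up.

```python
import json

def summarize_gc_dump(dump_text):
    """Return (gc_entries, rados_objects) for a JSON gc dump."""
    entries = json.loads(dump_text)
    # assumed layout: one entry per deleted rgw object,
    # with "objs" listing the rados objects that belong to it
    rados_objects = sum(len(entry.get("objs", [])) for entry in entries)
    return len(entries), rados_objects

# made-up sample in the assumed format, not real cluster output
SAMPLE = json.dumps([
    {"tag": "tag-1", "objs": [{"pool": "data", "oid": "a"},
                              {"pool": "data", "oid": "b"}]},
    {"tag": "tag-2", "objs": [{"pool": "data", "oid": "c"}]},
])

entry_count, object_count = summarize_gc_dump(SAMPLE)
```

running this periodically and recording both counts would give the
daily growth figure directly.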

after a bit of investigation it turned out that many of the objects in
the gc queue had already been garbage collected, i.e. rgw has deleted
their tail objects from the rados rgw data pool, but has failed to
remove the entries from the gc queue.
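
a rough sketch of the check described above: split the gc entries into
those whose rados objects are all gone already and those that still
have something left to delete. the entry layout (`tag`, `objs`, `pool`,
`oid`) is assumed as in a gc list dump, and `exists` is a hypothetical
caller-supplied helper. on a real cluster it could wrap
`rados -p <pool> stat <oid>`, or `Ioctx.stat()` from the python rados
bindings, which raises `ObjectNotFound` for missing objects. here it is
faked with a set.

```python
def split_stale_entries(entries, exists):
    """Partition gc entry tags into (stale, live).

    An entry counts as stale when none of its rados objects exists
    any more. `exists(pool, oid)` is a caller-supplied check.
    """
    stale, live = [], []
    for entry in entries:
        objs = entry.get("objs", [])
        # entries without any listed objects are kept as live, to be safe
        if objs and not any(exists(o["pool"], o["oid"]) for o in objs):
            stale.append(entry["tag"])
        else:
            live.append(entry["tag"])
    return stale, live

# made-up data: only ("data", "b") still exists in the fake pool
FAKE_POOL = {("data", "b")}
ENTRIES = [
    {"tag": "tag-1", "objs": [{"pool": "data", "oid": "a"}]},
    {"tag": "tag-2", "objs": [{"pool": "data", "oid": "b"},
                              {"pool": "data", "oid": "c"}]},
]

stale, live = split_stale_entries(
    ENTRIES, lambda pool, oid: (pool, oid) in FAKE_POOL)
```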

the logs (debug_rgw = 20) do not show anything unusual. deletes
succeed. even when deleting an already deleted rgw object (i.e. its
rados tail objects), there are no complaints in the log (even though
the deletes of the rados objects must fail, as the objects are not
present anymore). however, even after the n-th deletion, the objects
are not removed from the gc queue.

so, can someone help with the following:
- any pointers on where to start debugging this? I am at a loss since
the rgws seem happy enough according to the logs.
- any ideas on how to remedy this situation? it will become a problem
in a week or two, according to the trends.

as for a remedy in case we cannot diagnose and fix the cause soon
enough, I was thinking about:
- stopping deletes to rgws for a short while,
- dumping the gc queue contents,
- stopping rgws,
- clearing or recreating the rgw gc queue structures on rados pools,
- restarting rgws and deletes,
- manually deleting the rados objects in the old gc queue dump.
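
the last step of the plan above could be prepared as a dry run: turn
the saved gc dump into a list of `rados rm` commands that can be
reviewed before anything is executed. this is only a sketch. it assumes
the same entry layout as a gc list dump (`objs` with `pool` and `oid`)
and that the tail objects live in the default namespace of their pool.
the function only builds strings and does not touch the cluster.

```python
import shlex

def rm_commands(entries):
    """Build one 'rados rm' command line per unique (pool, oid)."""
    seen = set()
    commands = []
    for entry in entries:
        for obj in entry.get("objs", []):
            key = (obj["pool"], obj["oid"])
            if key in seen:
                continue  # the same rados object may be listed twice
            seen.add(key)
            commands.append("rados -p {} rm {}".format(
                shlex.quote(key[0]), shlex.quote(key[1])))
    return commands

# made-up dump entries, not real cluster output
DUMP = [
    {"tag": "tag-1", "objs": [{"pool": "pool", "oid": "obj_1"}]},
    {"tag": "tag-2", "objs": [{"pool": "pool", "oid": "obj_1"},
                              {"pool": "pool", "oid": "obj 2"}]},
]

commands = rm_commands(DUMP)
```

object names are shell-quoted because rgw tail object names can contain
characters that a shell would otherwise interpret.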

is that a sound plan?

if so, what exactly does the "clearing or recreating the rgw gc queue
structures on rados pools" entail?

I am under the impression that the gc queue is stored in gc.<number>
objects in the GC namespace in the default.rgw.log pool. 
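
to make that concrete, a small sketch of the object names I would
expect to find. it assumes the shards are named `gc.<n>` with `n`
running from 0 up to `rgw_gc_max_objs` (32 by default, as far as I
know), and that they can be listed with something like
`rados -p default.rgw.log --namespace gc ls`. both points are
assumptions to verify on the cluster itself.

```python
def gc_shard_names(max_objs=32):
    """Expected gc shard object names, assuming the 'gc.<n>' scheme."""
    return ["gc.{}".format(i) for i in range(max_objs)]

def missing_shards(listed_names, max_objs=32):
    """Expected shard names that do not show up in a pool listing."""
    listed = set(listed_names)
    return [name for name in gc_shard_names(max_objs) if name not in listed]

# made-up listing with one shard absent
absent = missing_shards(["gc.0", "gc.2"], max_objs=3)
```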

would just deleting these and starting rgw do the trick? or do I need
to somehow recreate empty objects in their stead? 

best regards,
  Jaka

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx