hi, pritha,

On Wed, 2021-11-24 at 16:41 +0530, Pritha Srivastava wrote:
> On Wed, Nov 24, 2021 at 4:11 PM Jaka Močnik <jaka@xxxxxxxxx> wrote:
[...]
> > after a bit of investigation it turned out that many of the objects in
> > the gc queue were already garbage collected. i.e. rgw has deleted them
> > from the rados rgw data pool, but has failed to remove them from the gc
> > queue.
> >
>
> How did you diagnose this?

by dumping the gc queue via radosgw-admin gc list --include-all, checking
that objects still in that list had already been deleted before the dump
was made, and looking at the rgw logs.

here is an example for one such rados object:

----
rados object
23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1
has been deleted at least once. logs:

2021-11-23T14:54:00.061+0100 7f6afa7fc700 20 garbage collection: RGWGC::process iterating over entry tag='23d143e2-d02d-4481-ba81-e783696ec99f.93072205.26537934^@', time=2021-11-21T12:01:08.225897+0100, chain.objs.size()=3
2021-11-23T14:54:00.061+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing default.rgw.buckets.data:23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1
2021-11-23T14:54:00.753+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing default.rgw.buckets.data:23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_2
2021-11-23T14:54:00.753+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing default.rgw.buckets.data:23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_3

object indeed does not exist in the data pool anymore:

# rados -p default.rgw.buckets.data get 23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1 out.bin
error getting default.rgw.buckets.data/23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1: (2) No such file or directory

however, it is still present in a gc queue listing made after the time of
deletion:

{
    "tag": "23d143e2-d02d-4481-ba81-e783696ec99f.93072205.26537934\u0000",
    "time": "2021-11-21T12:01:08.225897+0100",
    "objs": [
        {
            "pool": "default.rgw.buckets.data",
            "oid": "23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1",
            "key": "",
            "instance": ""
        },
        {
            "pool": "default.rgw.buckets.data",
            "oid": "23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_2",
            "key": "",
            "instance": ""
        },
        {
            "pool": "default.rgw.buckets.data",
            "oid": "23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_3",
            "key": "",
            "instance": ""
        }
    ]
},
----

[...]
> Have you tried running radosgw-admin gc list command? Are some
> entries always there, past their expiration time? There is a flag
> --include-all which can also be used to list all expired and unexpired
> entries.

yes. there are objects there for each day since 10. 11. 2021. some of
them are getting deleted over and over again, and remain in the gc
queue. the number of objects remaining from each day does not ever
change, so I'm thinking it's a problem only with some of the gc
"shards."

> Also in the logs - do you see this "RGWGC::process removing entries,
> marker: "? Are the markers getting repeated?

no, the markers are not repeating right after each other. can't really
tell in the long run. here is a grep for "marker" from today's logs of
one of the rgws.

----
2021-11-24T07:06:50.647+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:06:51.515+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:06:52.115+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T07:06:52.411+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:06:54.687+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/21854656'
2021-11-24T07:24:33.287+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:24:33.479+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:24:34.135+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T07:24:34.483+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:24:34.831+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T07:24:35.279+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:24:35.591+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T07:24:35.775+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:24:36.267+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T07:24:40.803+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/23111053'
2021-11-24T07:45:17.850+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T07:45:18.366+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:45:19.086+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:45:19.474+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:45:20.018+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T07:45:20.426+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T07:45:22.082+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:45:22.554+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:45:23.310+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T07:45:27.902+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/22157454'
2021-11-24T08:05:11.768+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=8, truncated=0, next_marker=''
2021-11-24T08:05:12.896+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T08:05:13.056+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T08:05:13.380+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T08:05:13.624+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=8, truncated=0, next_marker=''
2021-11-24T08:05:14.236+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T08:05:21.392+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/29926681'
2021-11-24T08:47:37.722+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T08:47:45.474+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=49, truncated=0, next_marker=''
2021-11-24T09:32:49.386+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=11, truncated=0, next_marker=''
2021-11-24T09:33:03.414+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T09:33:03.662+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=10, truncated=0, next_marker=''
2021-11-24T09:33:05.442+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T09:33:11.874+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/30714984'
2021-11-24T10:17:46.603+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=5, truncated=0, next_marker=''
2021-11-24T10:17:53.991+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:17:54.279+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T10:17:58.087+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:17:58.827+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=5, truncated=0, next_marker=''
2021-11-24T10:18:00.771+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:18:03.363+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/20026975'
2021-11-24T10:24:25.192+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:24:25.500+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:24:31.121+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=54, truncated=0, next_marker=''
2021-11-24T10:49:31.582+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T10:49:33.070+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:33.318+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T10:49:36.294+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:36.694+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=7, truncated=0, next_marker=''
2021-11-24T10:49:42.014+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:42.690+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:43.050+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:43.718+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:45.994+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:46.262+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T10:49:47.154+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:47.634+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:49.766+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:50.318+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T10:49:50.614+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T10:49:52.006+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:52.422+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:54.371+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:54.647+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:55.795+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:56.227+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:56.731+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:57.035+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:58.491+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:49:58.699+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T10:50:00.395+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:50:00.711+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T10:50:03.099+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T10:50:07.855+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/26269608'
2021-11-24T11:21:04.061+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T11:21:12.849+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T11:21:13.385+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T11:21:16.573+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T11:21:17.261+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T11:21:18.981+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T11:21:23.617+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/21488979'
2021-11-24T11:50:19.050+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T11:50:23.438+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T11:50:24.554+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T11:50:25.798+0100 7f6afa7fc700 5 garbage collection: RGWGC::process removing entries, marker:
2021-11-24T11:50:27.731+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=18, truncated=0, next_marker=''
----

> > with regard to remedy in case we cannot diagnose the cause and fix it
> > soon enough, I was
thinking about:
> > - stopping deletes to rgws for a short while,
> > - dumping the gc queue contents,
> > - stopping rgws,
> > - clearing or recreating the rgw gc queue structures on rados pools,
> > - restarting rgws and deletes,
> > - manually deleting the rados objects in the old gc queue dump.
> >
> > is that a sound plan?
> >
> > if so, what exactly does the "clearing or recreating the rgw gc queue
> > structures on rados pools" entail?
> >
> > I am under the impression that the gc queue is stored in gc.<number>
> > objects in the GC namespace in the default.rgw.log pool.
> >
> > would just deleting these and starting rgw do the trick? or do I need
> > to somehow recreate empty objects in their stead?
> >
> >
>
> Have you tried using the command: radosgw-admin gc process, to clear
> the expired entries and with --include-all to clear all entries?

yes. it finishes (takes ~12h). the deletes get run according to logs.
the problematic objects still remain in the gc queue.

I should perhaps note that not all objects exhibit this problem. some
(in my estimation ~80%) get removed just fine. but the ones that don't
are getting deleted over and over again but are never removed.

fwiw, these are the rgw gc settings that we explicitly set (others
should be default). note that there was no problem with these settings
for a long time, on either nautilus or octopus:

----
rgw_gc_max_objs = 128
rgw_gc_obj_min_wait = 3600
rgw_gc_processor_max_time = 300
rgw_gc_processor_period = 300
----

regards,
Jaka
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
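
for reference, the cross-check described in this mail (gc queue entries whose rados objects are already gone, and how many entries remain per day) could be scripted roughly as below. this is only a sketch under assumptions: it assumes the output of radosgw-admin gc list --include-all was saved to a file as a JSON array of entries shaped like the example entry in this mail, and that a zero exit status from rados -p <pool> stat <oid> means the object exists. the function names are made up for illustration and are not part of any ceph tooling.

```python
import json
import subprocess
from collections import Counter


def load_gc_dump(path):
    """Load a saved gc list dump (assumed to be a JSON array of entries)."""
    with open(path) as f:
        return json.load(f)


def entries_per_day(entries):
    """Count gc entries by the date part (YYYY-MM-DD) of their "time" field."""
    return Counter(entry["time"][:10] for entry in entries)


def rados_object_exists(pool, oid):
    """Assumption: `rados -p POOL stat OID` exits with 0 only if the object exists."""
    result = subprocess.run(
        ["rados", "-p", pool, "stat", oid],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def stale_entries(entries, exists=rados_object_exists):
    """Return gc entries that list objects, none of which still exist."""
    stale = []
    for entry in entries:
        objs = entry.get("objs", [])
        if objs and not any(exists(o["pool"], o["oid"]) for o in objs):
            stale.append(entry)
    return stale
```

calling stale_entries(load_gc_dump("gc-dump.json")) on a dump file (the file name here is just an example) would list the entries that look already collected, and entries_per_day() on that result gives the per-day counts to compare between two dumps. the exists argument can be swapped for a lookup against a saved pool listing to avoid running one rados call per object.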