Re: GCed (as in tail objects already deleted from the data pool) objects remain in the GC queue forever

hi, pritha,

On Wed, 2021-11-24 at 16:41 +0530, Pritha Srivastava wrote:
> On Wed, Nov 24, 2021 at 4:11 PM Jaka Močnik <jaka@xxxxxxxxx> wrote:
[...]
> > after a bit of investigation it turned out that many of the objects
> > in
> > the gc queue were already garbage collected. i.e. rgw has deleted
> > them
> > from the rados rgw data pool, but has failed to remove them from
> > the gc
> > queue.
> > 
> 
> How did you diagnose this?
by dumping the gc queue via radosgw-admin gc list --include-all, and
then checking in the rgw logs that objects still in that list had
already been deleted before the dump was taken.

here is an example for one such rados object:

----
rados object

23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1

has been deleted at least once. logs:

2021-11-23T14:54:00.061+0100 7f6afa7fc700 20 garbage collection: RGWGC::process iterating over entry tag='23d143e2-d02d-4481-ba81-e783696ec99f.93072205.26537934^@', time=2021-11-21T12:01:08.225897+0100, chain.objs.size()=3
2021-11-23T14:54:00.061+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing default.rgw.buckets.data:23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1
2021-11-23T14:54:00.753+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing default.rgw.buckets.data:23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_2
2021-11-23T14:54:00.753+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing default.rgw.buckets.data:23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_3

object indeed does not exist in the data pool anymore:

# rados -p default.rgw.buckets.data get 23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1 out.bin
error getting default.rgw.buckets.data/23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1: (2) No such file or directory

however, it is still present in a gc queue listing made after the time of deletion:

    {   
        "tag": "23d143e2-d02d-4481-ba81-e783696ec99f.93072205.26537934\u0000",
        "time": "2021-11-21T12:01:08.225897+0100",
        "objs": [
            {   
                "pool": "default.rgw.buckets.data",
                "oid": "23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_1",
                "key": "",
                "instance": ""
            },
            {   
                "pool": "default.rgw.buckets.data",
                "oid": "23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_2",
                "key": "",
                "instance": ""
            },
            {   
                "pool": "default.rgw.buckets.data",
                "oid": "23d143e2-d02d-4481-ba81-e783696ec99f.43219778.5048__shadow_.fK9K7WI3BhIiUbDXoS5UAmcpYqmShR5_3",
                "key": "",
                "instance": ""
            }
        ]
    },
----
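
the cross-check described above (gc queue dump vs. "RGWGC::process removing" log lines) can be scripted. what follows is only a minimal sketch, not the exact procedure used here. it assumes the dump is the JSON array printed by radosgw-admin gc list --include-all, that the log lines have the same shape as the excerpt above, and that only log lines written before the dump are fed in. the function names are made up for illustration.

```python
import json
import re

# Matches rgw log lines of the form
#   ... RGWGC::process removing <pool>:<oid>
# and skips the unrelated "RGWGC::process removing entries, marker: ..." lines.
REMOVING_RE = re.compile(r"RGWGC::process removing (?!entries,)([^\s:]+):(\S+)")


def load_gc_list(text):
    """Parse a gc queue dump (assumed to be a JSON array of entries)."""
    return json.loads(text)


def removed_objects(log_lines):
    """Return the set of (pool, oid) pairs the gc thread logged as removed."""
    removed = set()
    for line in log_lines:
        m = REMOVING_RE.search(line)
        if m:
            removed.add((m.group(1), m.group(2)))
    return removed


def stale_entries(gc_entries, removed):
    """Return tags of queue entries whose objects were all logged as removed."""
    stale = []
    for entry in gc_entries:
        objs = [(o["pool"], o["oid"]) for o in entry.get("objs", [])]
        # An entry with no objects is not counted as stale.
        if objs and all(o in removed for o in objs):
            stale.append(entry["tag"])
    return stale
```

to use it, pass the contents of the dump file to load_gc_list(), pass an open rgw log file (or any iterable of lines) to removed_objects(), and hand both results to stale_entries(). a tag in the result only means the log claims all of its objects were removed. confirming that they are really gone still needs a check against the data pool, as with the rados get above.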

[...]
> Have you tried running radosgw-admin gc list command? Are some
> entries always there, past their expiration time? There is a flag --
> include-all which can also be used to list all expired and unexpired
> entries.
yes. there are objects there for each day since 10. 11. 2021. some of
them are getting deleted over and over again, yet remain in the gc
queue. the number of objects remaining from each day never changes,
so I'm thinking it's a problem only with some of the gc "shards."

> Also in the logs - do you see this "RGWGC::process removing entries,
> marker: "? Are the markers getting repeated?
no, the markers are not repeating right after each other. can't really
tell in the long run. here is a grep for "marker" from today's logs of
one of the rgws.

----
2021-11-24T07:06:50.647+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:06:51.515+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:06:52.115+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T07:06:52.411+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:06:54.687+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/21854656'
2021-11-24T07:24:33.287+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:24:33.479+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:24:34.135+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T07:24:34.483+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:24:34.831+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T07:24:35.279+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:24:35.591+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T07:24:35.775+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:24:36.267+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T07:24:40.803+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/23111053'
2021-11-24T07:45:17.850+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T07:45:18.366+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:45:19.086+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:45:19.474+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:45:20.018+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T07:45:20.426+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T07:45:22.082+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:45:22.554+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T07:45:23.310+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T07:45:27.902+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/22157454'
2021-11-24T08:05:11.768+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=8, truncated=0, next_marker=''
2021-11-24T08:05:12.896+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T08:05:13.056+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T08:05:13.380+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T08:05:13.624+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=8, truncated=0, next_marker=''
2021-11-24T08:05:14.236+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T08:05:21.392+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/29926681'
2021-11-24T08:47:37.722+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T08:47:45.474+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=49, truncated=0, next_marker=''
2021-11-24T09:32:49.386+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=11, truncated=0, next_marker=''
2021-11-24T09:33:03.414+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T09:33:03.662+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=10, truncated=0, next_marker=''
2021-11-24T09:33:05.442+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T09:33:11.874+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/30714984'
2021-11-24T10:17:46.603+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=5, truncated=0, next_marker=''
2021-11-24T10:17:53.991+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:17:54.279+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T10:17:58.087+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:17:58.827+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=5, truncated=0, next_marker=''
2021-11-24T10:18:00.771+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:18:03.363+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/20026975'
2021-11-24T10:24:25.192+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:24:25.500+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:24:31.121+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=54, truncated=0, next_marker=''
2021-11-24T10:49:31.582+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T10:49:33.070+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:33.318+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T10:49:36.294+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:36.694+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=7, truncated=0, next_marker=''
2021-11-24T10:49:42.014+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:42.690+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:43.050+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:43.718+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:45.994+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:46.262+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T10:49:47.154+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:47.634+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:49.766+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:50.318+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
2021-11-24T10:49:50.614+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=4, truncated=0, next_marker=''
2021-11-24T10:49:52.006+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:52.422+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:54.371+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:54.647+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:55.795+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:56.227+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:56.731+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:57.035+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T10:49:58.491+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:49:58.699+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T10:50:00.395+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:50:00.711+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T10:50:03.099+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T10:50:07.855+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/26269608'
2021-11-24T11:21:04.061+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T11:21:12.849+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T11:21:13.385+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=3, truncated=0, next_marker=''
2021-11-24T11:21:16.573+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T11:21:17.261+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=1, truncated=0, next_marker=''
2021-11-24T11:21:18.981+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T11:21:23.617+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=100, truncated=1, next_marker='0/21488979'
2021-11-24T11:50:19.050+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T11:50:23.438+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T11:50:24.554+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=2, truncated=0, next_marker=''
2021-11-24T11:50:25.798+0100 7f6afa7fc700  5 garbage collection: RGWGC::process removing entries, marker: 
2021-11-24T11:50:27.731+0100 7f6afa7fc700 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=18, truncated=0, next_marker=''
----
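
to answer the "are the markers repeating" question over a longer period than one can eyeball, the next_marker values can be counted across a whole log. again a rough sketch only: it assumes log lines shaped like the ones above, and it cannot tell gc shards apart because these lines do not carry a shard index, so a repeat across different shards would be counted as well. the function names are hypothetical.

```python
import re
from collections import Counter

# Matches the tail of the cls_rgw_gc_queue_list_entries log line, e.g.
#   entries.size=100, truncated=1, next_marker='0/21854656'
NEXT_MARKER_RE = re.compile(
    r"entries\.size=(\d+), truncated=(\d), next_marker='([^']*)'"
)


def marker_counts(log_lines):
    """Count how often each non-empty next_marker value appears."""
    counts = Counter()
    for line in log_lines:
        m = NEXT_MARKER_RE.search(line)
        if m and m.group(3):
            counts[m.group(3)] += 1
    return counts


def repeated_markers(log_lines):
    """Return only the markers that were returned more than once."""
    return {marker: n for marker, n in marker_counts(log_lines).items() if n > 1}
```

an empty result from repeated_markers() on a multi-day log would suggest the listing is not stuck on one queue offset. a non-empty one would be worth a closer look at the lines around those markers.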

> > with regard to remedy in case we cannot diagnose the cause and fix
> > it
> > soon enough, I was thinking about:
> > - stopping deletes to rgws for a short while,
> > - dumping the gc queue contents,
> > - stopping rgws,
> > - clearing or recreating the rgw gc queue structures on rados
> > pools,
> > - restarting rgws and deletes,
> > - manually deleting the rados objects in the old gc queue dump.
> > 
> > is that a sound plan?
> > 
> > if so, what exactly does the "clearing or recreating the rgw gc
> > queue
> > structures on rados pools" entail?
> > 
> > I am under the impression that the gc queue is stored in
> > gc.<number>
> > objects in the GC namespace in the default.rgw.log pool. 
> > 
> > would just deleting these and starting rgw do the trick? or do I
> > need
> > to somehow recreate empty objects in their stead? 
> > 
> > 
> 
> Have you tried using the command: radosgw-admin gc process, to clear
> the expired entries and with --include-all to clear all entries? 
yes. it finishes (takes ~12h). the deletes get run according to logs.
the problematic objects still remain in the gc queue.

I should perhaps note that not all objects exhibit this problem. some
(in my estimation ~80%) get removed just fine. but the ones that don't
are getting deleted over and over again and are never removed from the
gc queue.

fwiw, these are the rgw gc settings that we explicitly set (others
should be default). note that there was no problem with these settings
for a long time, on either nautilus or octopus:
----
rgw_gc_max_objs = 128
rgw_gc_obj_min_wait = 3600
rgw_gc_processor_max_time = 300
rgw_gc_processor_period = 300
----

regards,
  Jaka

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



