RGW Lifecycle Problem (Reef)

Reef 18.2.2, package install on CentOS 9.

This is a very straightforward production cluster: two RGW hosts, no multisite. Four buckets have lifecycle policies:

$ radosgw-admin lc list
[
    {
        "bucket": ":aaaaa:1d84db34-ed2b-400a-842e-344cdfa3deed.261076466.1",
        "shard": "lc.3",
        "started": "Mon, 15 Jul 2024 00:00:00 GMT",
        "status": "COMPLETE"
    },
    {
        "bucket": ":bbbbb:1d84db34-ed2b-400a-842e-344cdfa3deed.304826873.1",
        "shard": "lc.3",
        "started": "Mon, 15 Jul 2024 00:00:00 GMT",
        "status": "COMPLETE"
    },
    {
        "bucket": ":cccc:1d84db34-ed2b-400a-842e-344cdfa3deed.828249990.3",
        "shard": "lc.29",
        "started": "Thu, 01 Jan 1970 00:00:00 GMT",
        "status": "COMPLETE"
    },
    {
        "bucket": ":dddd:1d84db34-ed2b-400a-842e-344cdfa3deed.547988508.2",
        "shard": "lc.29",
        "started": "Thu, 01 Jan 1970 00:00:00 GMT",
        "status": "UNINITIAL"
    }
]

Symptoms are:
- aaaaa and bbbbb work and are listed above as expected
- cccc works, but its "started" date is never updated
- dddd never fires unless run manually (the listing above was taken one week after the policy was removed and cleanly reapplied, with no manual run since; see the sketch after this list)
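
For completeness, the remove/reapply on dddd used the standard S3 lifecycle calls, roughly as below (aws CLI shown purely for illustration; the endpoint, days and storage-class name are placeholders, not the exact values in use):

$ aws --endpoint-url https://rgw.example.net s3api delete-bucket-lifecycle --bucket dddd
$ cat lifecycle.json
{
    "Rules": [
        {
            "ID": "transition-to-cold",
            "Status": "Enabled",
            "Filter": { "Prefix": "" },
            "Transitions": [
                { "Days": 30, "StorageClass": "COLDTIER" }
            ]
        }
    ]
}
$ aws --endpoint-url https://rgw.example.net s3api put-bucket-lifecycle-configuration --bucket dddd --lifecycle-configuration file://lifecycle.json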

The first three buckets are small, but dddd is larger (6.3 TB, 3.2 M RGW objects).

The policy for dddd transitions objects to a different storage class, but this never happens unless I run the lifecycle manually. The manual run works, yet the policy never fires automatically each night. IIRC after a manual run the status changes to something else (I can't remember exactly what), but I think the "started" date remains unset.
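
For reference, the manual run that does work is just the admin command below (bucket name as above; I believe --bucket restricts processing to that single bucket):

$ radosgw-admin lc get --bucket=dddd       # confirm the policy is attached
$ radosgw-admin lc process --bucket=dddd   # transitions objects as expected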

Each night the following log entries can be seen in the ceph-client.rgw log on both RGW hosts:

2024-07-15T00:00:00.162+0000 7f1e44715640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.11 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.163+0000 7f1e44715640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.248+0000 7f1e46719640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.18 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.248+0000 7f1e46719640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.312+0000 7f1e46719640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.3 head last stored at Mon Jul 15 00:00:00 2024
2024-07-15T00:00:00.386+0000 7f1e42711640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.3 head last stored at Mon Jul 15 00:00:00 2024
2024-07-15T00:00:00.399+0000 7f1e42711640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.18 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.399+0000 7f1e42711640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.420+0000 7f1e42711640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.11 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.421+0000 7f1e42711640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.435+0000 7f1e42711640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.29 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.435+0000 7f1e42711640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.382+0000 7f1e46719640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.11 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.382+0000 7f1e46719640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.392+0000 7f1e46719640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.29 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.392+0000 7f1e46719640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.718+0000 7f1e44715640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.3 head last stored at Mon Jul 15 00:00:00 2024
2024-07-15T00:00:02.718+0000 7f1e44715640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.728+0000 7f1e44715640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.29 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.728+0000 7f1e44715640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.791+0000 7f1e44715640  0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.18 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.791+0000 7f1e44715640  0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
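
If it helps with diagnosis, I can dump the lc shard objects directly; as far as I understand they live in the zone's log pool under the "lc" namespace, with per-bucket entries kept in omap (pool name below assumes a default zone; adjust as needed):

$ rados -p default.rgw.log --namespace lc ls
$ rados -p default.rgw.log --namespace lc listomapvals lc.29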

In case it is relevant: for the RGW-related pools, the index pool is replicated (R3), the data pool is EC 2+2, and the pool backing the storage class we're moving objects to is EC 4+2.

Any ideas how to proceed with diagnosis/fix?

Thanks, Chris



