Reef 18.2.2, package install on CentOS 9.
This is a very straightforward production cluster: 2 RGW hosts, no
multisite. 4 buckets have lifecycle policies:
$ radosgw-admin lc list
[
  {
    "bucket": ":aaaaa:1d84db34-ed2b-400a-842e-344cdfa3deed.261076466.1",
    "shard": "lc.3",
    "started": "Mon, 15 Jul 2024 00:00:00 GMT",
    "status": "COMPLETE"
  },
  {
    "bucket": ":bbbbb:1d84db34-ed2b-400a-842e-344cdfa3deed.304826873.1",
    "shard": "lc.3",
    "started": "Mon, 15 Jul 2024 00:00:00 GMT",
    "status": "COMPLETE"
  },
  {
    "bucket": ":cccc:1d84db34-ed2b-400a-842e-344cdfa3deed.828249990.3",
    "shard": "lc.29",
    "started": "Thu, 01 Jan 1970 00:00:00 GMT",
    "status": "COMPLETE"
  },
  {
    "bucket": ":dddd:1d84db34-ed2b-400a-842e-344cdfa3deed.547988508.2",
    "shard": "lc.29",
    "started": "Thu, 01 Jan 1970 00:00:00 GMT",
    "status": "UNINITIAL"
  }
]
Symptoms are:
- aaaaa and bbbbb work and appear in the listing above as expected
- ccccc works, but its "started" date is never updated
- ddddd never fires unless run manually (the listing above was taken 1 week
after the policy was removed and cleanly reapplied, with no manual run since)
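
A minimal sketch, in Python, of how a listing like the one above could be
screened for entries that look as if they have never been processed
automatically. It assumes the `radosgw-admin lc list` output has been captured
as a JSON string. The function name `flag_lc_entries` is hypothetical, and
reading the "bucket" field as tenant:name:marker is an assumption based on the
output shown, not a documented format.

```python
import json

# "started" value shown for entries with no recorded start time.
EPOCH = "Thu, 01 Jan 1970 00:00:00 GMT"


def flag_lc_entries(lc_list_json):
    """Map bucket name -> list of reasons the entry looks unprocessed."""
    flagged = {}
    for entry in json.loads(lc_list_json):
        # Assumption: the field is "<tenant>:<name>:<marker>", tenant empty here.
        name = entry["bucket"].split(":")[1]
        reasons = []
        if entry["started"] == EPOCH:
            reasons.append("started is still the Unix epoch")
        if entry["status"] != "COMPLETE":
            reasons.append("status is " + entry["status"])
        if reasons:
            flagged[name] = reasons
    return flagged


# Two entries copied from the listing above.
SAMPLE = """
[
  {"bucket": ":aaaaa:1d84db34-ed2b-400a-842e-344cdfa3deed.261076466.1",
   "shard": "lc.3",
   "started": "Mon, 15 Jul 2024 00:00:00 GMT",
   "status": "COMPLETE"},
  {"bucket": ":dddd:1d84db34-ed2b-400a-842e-344cdfa3deed.547988508.2",
   "shard": "lc.29",
   "started": "Thu, 01 Jan 1970 00:00:00 GMT",
   "status": "UNINITIAL"}
]
"""

for name, reasons in flag_lc_entries(SAMPLE).items():
    print(name + ": " + "; ".join(reasons))
```

Run nightly against fresh output, a check like this would show whether an
entry ever leaves the epoch/UNINITIAL state without a manual run.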
The first three buckets are small, but ddddd is larger (6.3 TB, 3.2 M
RGW objects).
The policy for ddddd transitions objects to a different storage class.
However, this never happens unless I run the lifecycle manually. The manual
run works, but the policy still never fires automatically each night.
IIRC, after a manual run the status changes to something else (I can't
remember what), but I think the started date remains unset.
Each night the following log entries can be seen in the ceph-client.rgw
log on both RGW hosts:
2024-07-15T00:00:00.162+0000 7f1e44715640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.11 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.163+0000 7f1e44715640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.248+0000 7f1e46719640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.18 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.248+0000 7f1e46719640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.312+0000 7f1e46719640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.3 head last stored at Mon Jul 15 00:00:00 2024
2024-07-15T00:00:00.386+0000 7f1e42711640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.3 head last stored at Mon Jul 15 00:00:00 2024
2024-07-15T00:00:00.399+0000 7f1e42711640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.18 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.399+0000 7f1e42711640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.420+0000 7f1e42711640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.11 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.421+0000 7f1e42711640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.435+0000 7f1e42711640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.29 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:00.435+0000 7f1e42711640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.382+0000 7f1e46719640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.11 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.382+0000 7f1e46719640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.392+0000 7f1e46719640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.29 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.392+0000 7f1e46719640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.718+0000 7f1e44715640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.3 head last stored at Mon Jul 15 00:00:00 2024
2024-07-15T00:00:02.718+0000 7f1e44715640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.728+0000 7f1e44715640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.29 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.728+0000 7f1e44715640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:02.791+0000 7f1e44715640 0 lifecycle: RGWLC::process() head.marker !empty() at START for shard==lc.18 head last stored at Mon Nov 13 00:00:00 2023
2024-07-15T00:00:02.791+0000 7f1e44715640 0 lifecycle: RGWLC::process() sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
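
A rough sketch, in Python, of how log entries like the ones above could be
summarised per shard: the "head last stored" date and the number of ret==-2
errors (on Linux, 2 is ENOENT, "No such file or directory"). The function name
`summarize_lc_log` is hypothetical. Because the get_entry error lines do not
name a shard, the sketch attributes each error to the shard most recently
logged by the same thread ID; that attribution is an assumption, not something
the log states.

```python
import errno
import re
from collections import defaultdict

# ret==-2 in the log corresponds to -ENOENT on Linux.
assert errno.errorcode[2] == "ENOENT"

HEAD_RE = re.compile(r"shard==(lc\.\d+) head last stored at (.+)$")
# Zero-width split point in front of each timestamp that starts an entry.
ENTRY_START_RE = re.compile(r"(?=\d{4}-\d{2}-\d{2}T\d{2}:)")


def summarize_lc_log(log_text):
    """Return {shard: {"head_last_stored": str, "enoent": int}}."""
    # Collapse whitespace so entries wrapped over several lines are rejoined.
    text = " ".join(log_text.split())
    summary = defaultdict(lambda: {"head_last_stored": None, "enoent": 0})
    last_shard_by_thread = {}
    for entry in ENTRY_START_RE.split(text):
        entry = entry.strip()
        fields = entry.split()
        if len(fields) < 2:
            continue
        thread = fields[1]  # second field is the thread ID
        match = HEAD_RE.search(entry)
        if match:
            shard = match.group(1)
            summary[shard]["head_last_stored"] = match.group(2)
            last_shard_by_thread[thread] = shard
        elif "ret==-2" in entry and thread in last_shard_by_thread:
            # Assumption: the error belongs to the thread's previous shard line.
            summary[last_shard_by_thread[thread]]["enoent"] += 1
    return dict(summary)


# Three entries copied from the log above, still wrapped as in the mail.
SAMPLE_LOG = """
2024-07-15T00:00:00.162+0000 7f1e44715640 0 lifecycle: RGWLC::process()
head.marker !empty() at START for shard==lc.11 head last stored at Mon
Nov 13 00:00:00 2023
2024-07-15T00:00:00.163+0000 7f1e44715640 0 lifecycle: RGWLC::process()
sal_lc->get_entry(lc_shard, head.marker, entry) returned error ret==-2
2024-07-15T00:00:00.312+0000 7f1e46719640 0 lifecycle: RGWLC::process()
head.marker !empty() at START for shard==lc.3 head last stored at Mon
Jul 15 00:00:00 2024
"""

for shard, info in sorted(summarize_lc_log(SAMPLE_LOG).items()):
    print(shard, info["head_last_stored"], "ENOENT count:", info["enoent"])
```

Applied to the full log, a summary like this makes it easier to see which
shards have a stale head and which ones also report ret==-2.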
In case it is relevant: for the RGW-related pools, index is R3, data is
EC2+2, and the pool for the storage class we're moving objects to is EC4+2.
Any ideas on how to proceed with diagnosis or a fix?
Thanks, Chris