Bucket index logs (bilogs) not being trimmed automatically (multisite, ceph nautilus 14.2.9)

Hi all,

We're seeing a problem in our multisite Ceph deployment, where bilogs aren't being trimmed for several buckets. This is causing bilogs to accumulate over time, leading to large OMAP object warnings for the indexes on these buckets.

In every case, Ceph reports that the bucket is in sync and the data is consistent across both sites, so we're perplexed as to why the logs aren't being trimmed. It isn't affecting all of our buckets, and we can't tell what is 'different' about the affected ones that causes their logs to accumulate. We're seeing this in both unsharded and sharded buckets. Some buckets with heavy activity (lots of object updates) have accumulated millions of bilogs, but not all of our very active buckets are affected.

I've tried running 'radosgw-admin bilog autotrim' against an affected bucket, and it doesn't appear to do anything. I've also used 'radosgw-admin bilog trim' with a suitable end-marker to trim all of the bilogs manually, but the implications of doing that aren't clear to me, and the logs simply continue to accumulate afterwards.
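
For reference, the commands I ran were of roughly this form (the end-marker placeholder below stands for the last marker reported by 'bilog list' for the bucket; I'm omitting the actual value):

$ radosgw-admin bilog autotrim --bucket=edin2z6-sharedconfig
$ radosgw-admin bilog trim --bucket=edin2z6-sharedconfig --end-marker=<last marker from 'bilog list'>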

We're running Ceph Nautilus 14.2.9 with the services in containers; we have 3 hosts on each site, with 1 OSD per host.

We're only overriding a fairly minimal set of config options, and I don't think any of them would affect bilog trimming. Checking the running config on the mon service, I think these defaulted parameters are relevant:

    "rgw_sync_log_trim_concurrent_buckets": "4",
    "rgw_sync_log_trim_interval": "1200",
    "rgw_sync_log_trim_max_buckets": "16",
    "rgw_sync_log_trim_min_cold_buckets": "4",

I can't find any documentation on these parameters, but we have more than 16 buckets, so is it possible that some buckets are just never being selected for trimming?
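
For completeness, this is roughly how the running values can also be checked against one of the RGW daemons directly via its admin socket (the container name and socket path below are placeholders for our containerised setup):

$ sudo docker exec <rgw-container> ceph --admin-daemon /var/run/ceph/<rgw-admin-socket>.asok config show | grep rgw_sync_log_trim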

Any other ideas as to what might be causing this, or anything else we could try to help diagnose or fix it? Thanks in advance!


I've included an example below for one such affected bucket, showing its current state. Zone details (as per 'radosgw-admin zonegroup get') are at the bottom.

$ radosgw-admin bucket sync status --bucket=edin2z6-sharedconfig
          realm b7f31089-0879-4fa2-9cbc-cfdf5f866a35 (geored_realm)
      zonegroup 5d74eb0e-5d99-481f-ae33-43483f6cebc0 (geored_zg)
           zone c48f33ad-6d79-4b9f-a22f-78589f67526e (siteA)
         bucket edin2z6-sharedconfig[033709fc-924a-4582-b00d-97c90e9e61b6.3634407.1]

    source zone 0a3c29b7-1a2c-432d-979b-d324a05cc831 (siteApubsub)
                full sync: 0/1 shards
                incremental sync: 0/1 shards
                bucket is caught up with source
    source zone 9f5fba56-4a32-46a6-8695-89253be81614 (siteB)
                full sync: 0/1 shards
                incremental sync: 1/1 shards
                bucket is caught up with source
    source zone c72b3aa8-a051-4665-9421-909510702412 (siteBpubsub)
                full sync: 0/1 shards
                incremental sync: 0/1 shards
                bucket is caught up with source

$ radosgw-admin bilog list --bucket edin2z6-sharedconfig --max-entries 600000000 | grep op_id | wc -l
1299392

$ rados -p siteA.rgw.buckets.index listomapkeys .dir.033709fc-924a-4582-b00d-97c90e9e61b6.3634407.1 | wc -l
1299083
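
Individual entries can also be sampled rather than just counted, e.g. with the following (output left out here):

$ radosgw-admin bilog list --bucket edin2z6-sharedconfig --max-entries 5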



$ radosgw-admin bucket stats --bucket=edin2z6-sharedconfig
{
    "bucket": "edin2z6-sharedconfig",
    "num_shards": 0,
    "tenant": "",
    "zonegroup": "5d74eb0e-5d99-481f-ae33-43483f6cebc0",       
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": ""
    },
    "id": "033709fc-924a-4582-b00d-97c90e9e61b6.3634407.1",    
    "marker": "033709fc-924a-4582-b00d-97c90e9e61b6.3634407.1",
    "index_type": "Normal",
    "owner": "edin2z6",
    "ver": "0#1622676",
    "master_ver": "0#0",
    "mtime": "2020-01-14 14:30:18.606142Z",
    "max_marker": "0#00001622675.2115836.5",
    "usage": {
        "rgw.main": {
            "size": 15209,
            "size_actual": 40960,
            "size_utilized": 15209,
            "size_kb": 15,
            "size_kb_actual": 40,
            "size_kb_utilized": 15,
            "num_objects": 7
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

$ radosgw-admin bucket limit check
...
            {
                "bucket": "edin2z6-sharedconfig",
                "tenant": "",
                "num_objects": 7,
                "num_shards": 0,
                "objects_per_shard": 7,
                "fill_status": "OK"
            },
...

$ radosgw-admin zonegroup get
{
    "id": "5d74eb0e-5d99-481f-ae33-43483f6cebc0",
    "name": "geored_zg",
    "api_name": "geored_zg",
    "is_master": "true",
    "endpoints": [
        "https://10.254.2.93:7480";
    ],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "c48f33ad-6d79-4b9f-a22f-78589f67526e",
    "zones": [
        {
            "id": "0a3c29b7-1a2c-432d-979b-d324a05cc831",
            "name": "siteApubsub",
            "endpoints": [
                "https://10.254.2.93:7481";,
                "https://10.254.2.94:7481";,
                "https://10.254.2.95:7481";
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "pubsub",
            "sync_from_all": "false",
            "sync_from": [
                "siteA"
            ],
            "redirect_zone": ""
        },
        {
            "id": "9f5fba56-4a32-46a6-8695-89253be81614",
            "name": "siteB",
            "endpoints": [
                "https://10.254.2.224:7480";,
                "https://10.254.2.225:7480";,
                "https://10.254.2.226:7480";
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        },
        {
            "id": "c48f33ad-6d79-4b9f-a22f-78589f67526e",
            "name": "siteA",
            "endpoints": [
                "https://10.254.2.93:7480";,
                "https://10.254.2.94:7480";,
                "https://10.254.2.95:7480";
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        },
        {
            "id": "c72b3aa8-a051-4665-9421-909510702412",
            "name": "siteBpubsub",
            "endpoints": [
                "https://10.254.2.224:7481";,
                "https://10.254.2.225:7481";,
                "https://10.254.2.226:7481";
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "pubsub",
            "sync_from_all": "false",
            "sync_from": [
                "siteB"
            ],
            "redirect_zone": ""
        }
    ],
    "placement_targets": [
        {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "STANDARD"
            ]
        }
    ],
    "default_placement": "default-placement",
    "realm_id": "b7f31089-0879-4fa2-9cbc-cfdf5f866a35"
}


