Sorry to be posting a second mystery at the same time - though this
feels unconnected to my other one.
We had a user complain that they can't list the contents of one of their
buckets (they can access certain objects within the bucket).
I started by running a simple command to get data on the bucket:
root@cephmon1:~# radosgw-admin bucket stats --bucket=mccuelab
error getting bucket stats ret=-2
Not encouraging, but it jogged a memory (ret=-2 is ENOENT)... some time ago
we had a bucket which had 32 index shards while its metadata showed
num_shards=0, and it gave this same error. So, looking at the bucket metadata...
root@cephmon1:~# radosgw-admin metadata get bucket.instance:mccuelab:default.2049236.2
{
"key": "bucket.instance:mccuelab:default.2049236.2",
"ver": {
"tag": "_pOR6OLmXKQxYuFBa0E-eEmK",
"ver": 17
},
"mtime": "2018-02-15 17:50:28.225135Z",
"data": {
"bucket_info": {
"bucket": {
"name": "mccuelab",
"marker": "default.2049236.2",
"bucket_id": "default.2049236.2",
"tenant": "",
"explicit_placement": {
"data_pool": ".rgw.buckets",
"data_extra_pool": "",
"index_pool": ".rgw.buckets"
}
},
"creation_time": "0.000000",
"owner": "uid=12093",
"flags": 0,
"zonegroup": "default",
"placement_rule": "",
"has_instance_obj": "true",
"quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1024,
"max_size_kb": 0,
"max_objects": -1
},
"num_shards": 32,
"bi_shard_hash_type": 0,
"requester_pays": "false",
"has_website": "false",
"swift_versioning": "false",
"swift_ver_location": "",
"index_type": 0,
"mdsearch_config": [],
"reshard_status": 0,
"new_bucket_instance_id": ""
},
"attrs": [
{
"key": "user.rgw.acl",
"val": "AgKxAAAAAwImAAAACQAAAHVpZD0xMjA5MxUAAABSb2JlcnQgSmFtZXMgU2NoYWVmZXIEA38AAAABAQAAAAkAAAB1aWQ9MTIwOTMPAAAAAQAAAAkAAAB1aWQ9MTIwOTMFA0oAAAACAgQAAAAAAAAACQAAAHVpZD0xMjA5MwAAAAAAAAAAAgIEAAAADwAAABUAAABSb2JlcnQgSmFtZXMgU2NoYWVmZXIAAAAAAAAAAAAAAAAAAAAA"
},
{
"key": "user.rgw.idtag",
"val": ""
}
]
}
}
But the index pool doesn't contain all 32 of these shard objects - only 15:
root@cephmon1:~# rados -p .rgw.buckets.index ls - | grep "default.2049236.2"
.dir.default.2049236.2.22
.dir.default.2049236.2.3
.dir.default.2049236.2.10
.dir.default.2049236.2.31
.dir.default.2049236.2.12
.dir.default.2049236.2.0
.dir.default.2049236.2.18
.dir.default.2049236.2.13
.dir.default.2049236.2.16
.dir.default.2049236.2.11
.dir.default.2049236.2.23
.dir.default.2049236.2.17
.dir.default.2049236.2.9
.dir.default.2049236.2.29
.dir.default.2049236.2.24
But wait a minute - I wasn't reading carefully. This is a really old
bucket: its explicit_placement has both the data pool and the index pool
set to .rgw.buckets.
OK, so I checked for index objects in the .rgw.buckets pool instead, and
there shards 0..31 are all present - that's good.
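(For the record, that check was basically something like
rados -p .rgw.buckets ls | grep -c '^\.dir\.default\.2049236\.2\.'
which counts all 32 shard objects, .0 through .31, in the data pool.)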
So why do any index objects even exist in the .rgw.buckets.index pool...?
I set debug rgw=1 and debug ms=1 and re-ran "radosgw-admin bi list"...
Amongst a lot of other output I see:
- a query to osd 204, pg 100.1c, which finds and lists the first entry from ".dir.default.2049236.2.0";
- then a query to osd 164, pg 100.3d, which returns "file not found" for ".dir.default.2049236.2.1";
- consistent with shard #0 existing in .rgw.buckets.index, but not shard #1.
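For anyone wanting to reproduce this, the debug levels were just passed as
overrides on the radosgw-admin command line, something like:
radosgw-admin bi list --bucket=mccuelab --debug-rgw=1 --debug-ms=1 2>&1 | less
The relevant log lines are: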
2018-02-16 18:13:10.405545 7f0f539beb80 1 -- 10.32.16.93:0/3172453804 --> 10.31.0.65:6812/58901 -- osd_op(unknown.0.0:97 100.1c 100:3a18c885:::.dir.default.2049236.2.0:head [call rgw.bi_list] snapc 0=[] ondisk+read+known_if_redirected e507701) v8 -- 0x7f0f55921f90 con 0
2018-02-16 18:13:10.410902 7f0f40a2f700 1 -- 10.32.16.93:0/3172453804 <== osd.204 10.31.0.65:6812/58901 1 ==== osd_op_reply(97 .dir.default.2049236.2.0 [call] v0'0 uv1416558 ondisk = 0) v8 ==== 168+0+317 (1847036665 0 2928341486) 0x7f0f3403d510 con 0x7f0f55925720[
{
"type": "plain",
"idx": "durwa004/Copenhagen_bam_files_3.tar.xz",
"entry": {
"name": "durwa004/Copenhagen_bam_files_3.tar.xz",
"instance": "",
"ver": {
"pool": 23,
"epoch": 179629
},
"locator": "",
"exists": "true",
"meta": {
"category": 1,
"size": 291210535540,
"mtime": "2018-02-09 04:59:43.869899Z",
"etag": "e75dc95f44944fe9df6a102c809566be-272",
"owner": "uid=12093",
"owner_display_name": "Robert James Schaefer",
"content_type": "application/x-xz",
"accounted_size": 291210535540,
"user_data": ""
},
"tag": "default.8366086.124333",
"flags": 0,
"pending_map": [],
"versioned_epoch": 0
}
}
2018-02-16 18:13:10.411748 7f0f539beb80 1 -- 10.32.16.93:0/3172453804 --> 10.31.0.71:6816/59857 -- osd_op(unknown.0.0:98 100.3d 100:bd22ae7d:::.dir.default.2049236.2.1:head [call rgw.bi_list] snapc 0=[] ondisk+read+known_if_redirected e507701) v8 -- 0x7f0f55924b40 con 0
ERROR: bi_list(): (2) No such file or directory
2018-02-16 18:13:10.414018 7f0f41a31700 1 -- 10.32.16.93:0/3172453804 <== osd.164 10.31.0.71:6816/59857 1 ==== osd_op_reply(98 .dir.default.2049236.2.1 [call] v0'0 uv0 ondisk = -2 ((2) No such file or directory)) v8 ==== 168+0+0 (600468650 0 0) 0x7f0f3c03d540 con 0x7f0f5592ad50
It seems to me rgw should be looking in pool 23 (.rgw.buckets), not pool
100 (.rgw.buckets.index)?
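(The pool id to name mapping is easy to double-check, e.g.
ceph osd pool ls detail | grep -E "^pool (23|100) "
- here 23 is .rgw.buckets and 100 is .rgw.buckets.index.)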
Presumably at some point rgw started using the default index pool defined
in the rgw zone (including for adding new entries?), rather than the pools
defined by explicit_placement in the bucket metadata.
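One way to test that theory, which I haven't done yet, would be to compare
the contents of the same shard in each pool - e.g. dump the omap keys of
shard 0 from both and see whether the copy in .rgw.buckets.index only holds
recently-written objects:
rados -p .rgw.buckets listomapkeys .dir.default.2049236.2.0 | wc -l
rados -p .rgw.buckets.index listomapkeys .dir.default.2049236.2.0 | wc -l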
The last time we had "issues" of this sort (such as the incorrect
num_shards) was a long time ago, associated with the hammer->jewel upgrade
circa 11/2016. I find it hard to believe this has been going on unnoticed
for that long. Maybe it's related to our jewel->luminous upgrade
(12/2017)? I'll ask the user when they last listed the bucket successfully.
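Another data point I can probably get without the user: the mtime on the
stray shard objects, e.g.
rados -p .rgw.buckets.index stat .dir.default.2049236.2.0
should at least show when that copy of the index was last written, which
might line up with one of the upgrades.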
Here is the current zone, btw - I also have a json dump of this from
11/2016 and it seems largely unchanged (some things like lc_pool and
reshard_pool didn't exist back then).
root@cephmon1:~# radosgw-admin zone get default
{
"id": "default",
"name": "default",
"domain_root": ".rgw",
"control_pool": ".rgw.control",
"gc_pool": ".rgw.gc",
"lc_pool": ".log:lc",
"log_pool": ".log",
"intent_log_pool": ".intent-log",
"usage_log_pool": ".usage",
"reshard_pool": ".log:reshard",
"user_keys_pool": ".users",
"user_email_pool": ".users.email",
"user_swift_pool": ".users.swift",
"user_uid_pool": ".users.uid",
"system_key": {
"access_key": "",
"secret_key": ""
},
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": ".rgw.buckets.index",
"data_pool": ".rgw.buckets",
"data_extra_pool": ".rgw.buckets.extra",
"index_type": 0,
"compression": ""
}
},
{
"key": "ec42-placement",
"val": {
"index_pool": ".rgw.buckets.index",
"data_pool": ".rgw.buckets.ec42",
"data_extra_pool": ".rgw.buckets.extra",
"index_type": 0,
"compression": ""
}
}
],
"metadata_heap": ".rgw.meta",
"tier_config": [],
"realm_id": "dbfd45d9-e250-41b0-be3e-ab9430215d5b"
}
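(If it would be useful I can diff this properly against the old dump,
something along the lines of
diff <(radosgw-admin zone get default) zone-201611.json
where zone-201611.json is just a placeholder name for my saved copy - but
nothing obvious jumped out on a quick read.)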
Graham
--
Graham Allan
Minnesota Supercomputing Institute - gta@xxxxxxx