Re: Dynamic bucket index resharding bug? - rgw.none showing unreal number of objects

"J. Eric Ivancich" <ivancich@xxxxxxxxxx> · Fri, 22 Nov 2019 15:08:57 -0500

On 11/22/19 11:50 AM, David Monschein wrote:
> Hi all. Running an Object Storage cluster with Ceph Nautilus 14.2.4.
> 
> We are running into what appears to be a serious bug that is affecting
> our fairly new object storage cluster. While investigating some
> performance issues -- seeing abnormally high IOPS, extremely slow bucket
> stat listings (over 3 minutes) -- we noticed some dynamic bucket
> resharding jobs running. Strangely enough they were resharding buckets
> that had very few objects. Even more worrying was the number of new
> shards Ceph was planning: 65521
> 
> [root@os1 ~]# radosgw-admin reshard list
> [
>     {
>         "time": "2019-11-22 00:12:40.192886Z",
>         "tenant": "",
>         "bucket_name": "redacted",
>         "bucket_id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>         "new_instance_id": "redacted:c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7552496.28",
>         "old_num_shards": 1,
>         "new_num_shards": 65521
>     }
> ]
> 
> Upon further inspection we noticed a seemingly impossible number of
> objects (18446744073709551603) in rgw.none for the same bucket:
> [root@os1 ~]# radosgw-admin bucket stats --bucket=redacted
> {
>     "bucket": "redacted",
>     "tenant": "",
>     "zonegroup": "dbb69c5b-b33f-4af2-950c-173d695a4d2c",
>     "placement_rule": "default-placement",
>     "explicit_placement": {
>         "data_pool": "",
>         "data_extra_pool": "",
>         "index_pool": ""
>     },
>     "id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>     "marker": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>     "index_type": "Normal",
>     "owner": "d52cb8cc-1f92-47f5-86bf-fb28bc6b592c",
>     "ver": "0#12623",
>     "master_ver": "0#0",
>     "mtime": "2019-11-22 00:18:41.753188Z",
>     "max_marker": "0#",
>     "usage": {
>         "rgw.none": {
>             "size": 0,
>             "size_actual": 0,
>             "size_utilized": 0,
>             "size_kb": 0,
>             "size_kb_actual": 0,
>             "size_kb_utilized": 0,
>             "num_objects": 18446744073709551603
>         },
>         "rgw.main": {
>             "size": 63410030,
>             "size_actual": 63516672,
>             "size_utilized": 63410030,
>             "size_kb": 61924,
>             "size_kb_actual": 62028,
>             "size_kb_utilized": 61924,
>             "num_objects": 27
>         },
>         "rgw.multimeta": {
>             "size": 0,
>             "size_actual": 0,
>             "size_utilized": 0,
>             "size_kb": 0,
>             "size_kb_actual": 0,
>             "size_kb_utilized": 0,
>             "num_objects": 0
>         }
>     },
>     "bucket_quota": {
>         "enabled": false,
>         "check_on_raw": false,
>         "max_size": -1,
>         "max_size_kb": 0,
>         "max_objects": -1
>     }
> }
> 
> It would seem that the unreal number of objects in rgw.none is driving
> the resharding process, making Ceph reshard the bucket into 65521
> shards. I am assuming 65521 is the limit.
> 
> I have seen only a couple of references to this issue, none of which had
> a resolution or much of a conversation around them:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030791.html
> https://tracker.ceph.com/issues/37942
> 
> For now we are cancelling these resharding jobs since they seem to be
> causing performance issues with the cluster, but this is an untenable
> solution. Does anyone know what is causing this? Or how to prevent
> it/fix it?

2^64 (2 to the 64th power) is 18446744073709551616, which is 13 greater
than your value of 18446744073709551603. So this likely represents the
value of -13, but displayed in an unsigned format.
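
The arithmetic can be checked with a short Python sketch that reinterprets the reported counter as a signed 64-bit integer. The helper name as_signed_64 is made up for this illustration and is not part of Ceph or radosgw-admin.

```python
def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    # Values with the top bit set represent negative numbers.
    return value - 2**64 if value >= 2**63 else value


# num_objects reported for rgw.none in the bucket stats output
reported = 18446744073709551603

print(2**64 - reported)        # distance below 2^64: 13
print(as_signed_64(reported))  # same bits read as signed: -13
```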

Obviously it should not calculate a value of -13. I'm guessing it's a
bug: when bucket index entries that are categorized as rgw.none are
found, we're not adding to the stats, but when they're removed they are
being subtracted from the stats.
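
To make the guessed-at asymmetry concrete, here is a toy model in Python. It is only a sketch of the hypothesis above, not the actual RGW bucket index code: the class ToyBucketStats and its methods are invented for illustration, and the 64-bit mask simply mimics an unsigned counter wrapping around.

```python
MASK64 = 2**64 - 1  # emulate an unsigned 64-bit counter


class ToyBucketStats:
    """Hypothetical per-category object counters (not real RGW code)."""

    def __init__(self):
        self.num_objects = {"rgw.none": 0, "rgw.main": 0}

    def add_entry(self, category: str) -> None:
        # Assumed bug: entries categorized as rgw.none are never counted.
        if category == "rgw.none":
            return
        self.num_objects[category] = (self.num_objects[category] + 1) & MASK64

    def remove_entry(self, category: str) -> None:
        # Removal decrements unconditionally, so rgw.none can dip below zero.
        self.num_objects[category] = (self.num_objects[category] - 1) & MASK64


stats = ToyBucketStats()
for _ in range(13):  # 13 rgw.none entries appear and are later removed
    stats.add_entry("rgw.none")
    stats.remove_entry("rgw.none")

print(stats.num_objects["rgw.none"])  # 18446744073709551603
```

Under these assumptions, thirteen uncounted additions followed by thirteen counted removals reproduce exactly the value seen in the bucket stats.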

Interestingly resharding recalculates these, so you'll likely have a
much smaller value when you're done.

It seems the operations that result in rgw.none bucket index entries are
cancelled operations and removals.

We're currently looking at how best to deal with rgw.none stats here:

    https://github.com/ceph/ceph/pull/29062

Eric

-- 
J. Eric Ivancich
he/him/his
Red Hat Storage
Ann Arbor, Michigan, USA
