Re: OMAP warning ( again )

Brad Hubbard <bhubbard@xxxxxxxxxx> · Thu, 2 Aug 2018 08:38:34 +1000

RGW is not really my area, but I'd suggest that before you do *anything* you
establish which object the warning is talking about.
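
One way to narrow that down is to search the cluster log, including any
rotated copies, for the phrase the health detail output mentions. A minimal
sketch, assuming the cluster log lives under /var/log/ceph/ on the monitor
host and that older entries may sit in gzip-compressed rotated files (both
are assumptions about your setup; the function and variable names are purely
illustrative). It only filters on the phrase itself and assumes nothing about
the rest of the line's format:

```python
import glob
import gzip

# Phrase quoted in the "ceph health detail" output.
NEEDLE = "Large omap object found"


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def find_large_omap_lines(paths):
    """Return (path, line) pairs for every log line containing NEEDLE."""
    hits = []
    for path in paths:
        with open_log(path) as handle:
            for line in handle:
                if NEEDLE in line:
                    hits.append((path, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Assumed location; adjust to wherever your monitors write the cluster log.
    log_paths = sorted(glob.glob("/var/log/ceph/ceph.log*"))
    for path, line in find_large_omap_lines(log_paths):
        print(f"{path}: {line}")
```

If nothing turns up, the detailed message may have been written before the
oldest log you still have, or on a different monitor host.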

On Thu, Aug 2, 2018 at 8:08 AM, Brent Kennedy <bkennedy@xxxxxxxxxx> wrote:
> Ceph health detail gives this:
> HEALTH_WARN 1 large omap objects
> LARGE_OMAP_OBJECTS 1 large omap objects
>     1 large objects found in pool '.rgw.buckets.index'
>     Search the cluster log for 'Large omap object found' for more details.
>
> The ceph.log file on the monitor server only shows the 1 large omap objects message.
>
> I looked into the issue again and remembered it was related to bucket sharding.  I then remembered that in Luminous it was supposed to be dynamic. This time I checked what the shards were set to for one of our buckets, and the max shards is still set to 0.  The blog posting about it says that there isn’t anything we have to do, but I am wondering if the same is true for clusters that were upgraded to Luminous from older versions.
>
> Do I need to run this: radosgw-admin reshard add --bucket=<bucket> --num-shards=<num_shards>  for every bucket to get that going?
>
> When I look at a bucket ( BKTEST ), it shows num_shards as 0:
> root@ukpixmon1:/var/log/ceph# radosgw-admin metadata get bucket.instance:BKTEST:default.7320.3
> {
>     "key": "bucket.instance:BKTEST:default.7320.3",
>     "ver": {
>         "tag": "_JFn84AijvH8aWXWXyvSeKpZ",
>         "ver": 1
>     },
>     "mtime": "2018-01-10 18:50:07.994194Z",
>     "data": {
>         "bucket_info": {
>             "bucket": {
>                 "name": "BKTEST",
>                 "marker": "default.7320.3",
>                 "bucket_id": "default.7320.3",
>                 "tenant": "",
>                 "explicit_placement": {
>                     "data_pool": ".rgw.buckets",
>                     "data_extra_pool": ".rgw.buckets.extra",
>                     "index_pool": ".rgw.buckets.index"
>                 }
>             },
>             "creation_time": "2016-03-09 17:23:50.000000Z",
>             "owner": "zzzzzzzzzz",
>             "flags": 0,
>             "zonegroup": "default",
>             "placement_rule": "default-placement",
>             "has_instance_obj": "true",
>             "quota": {
>                 "enabled": false,
>                 "check_on_raw": false,
>                 "max_size": -1024,
>                 "max_size_kb": 0,
>                 "max_objects": -1
>             },
>             "num_shards": 0,
>             "bi_shard_hash_type": 0,
>             "requester_pays": "false",
>             "has_website": "false",
>             "swift_versioning": "false",
>             "swift_ver_location": "",
>             "index_type": 0,
>             "mdsearch_config": [],
>             "reshard_status": 0,
>             "new_bucket_instance_id": ""
>
> When I run the reshard command to change the number of shards:
> "radosgw-admin reshard add --bucket=BKTEST --num-shards=2"
>
> and then check the status with:
> "radosgw-admin reshard list"
>
> [
>     {
>         "time": "2018-08-01 21:58:13.306381Z",
>         "tenant": "",
>         "bucket_name": "BKTEST",
>         "bucket_id": "default.7320.3",
>         "new_instance_id": "",
>         "old_num_shards": 1,
>         "new_num_shards": 2
>     }
> ]
>
> If num_shards was 0, why does it say old_num_shards was 1?
>
> -Brent
>
> -----Original Message-----
> From: Brad Hubbard [mailto:bhubbard@xxxxxxxxxx]
> Sent: Tuesday, July 31, 2018 9:07 PM
> To: Brent Kennedy <bkennedy@xxxxxxxxxx>
> Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re:  OMAP warning ( again )
>
> Search the cluster log for 'Large omap object found' for more details.
>
> On Wed, Aug 1, 2018 at 3:50 AM, Brent Kennedy <bkennedy@xxxxxxxxxx> wrote:
>> Upgraded from 12.2.5 to 12.2.6, got a “1 large omap objects” warning
>> message, then upgraded to 12.2.7 and the message went away.  I just
>> added four OSDs to balance out the cluster ( we had some servers with
>> fewer drives in them; jbod config ) and now the “1 large omap objects”
>> warning message is back.  I did some google-fu to try to figure out
>> what it means and how to correct it, but the guidance on correcting it is a bit vague.
>>
>>
>>
>> We use rados gateways for all storage, so everything is in the
>> .rgw.buckets pool, which I gather from research is why we are getting
>> the warning message ( there are millions of objects in there ).
>>
>>
>>
>> Is there an if/then process for clearing this error message?
>>
>>
>>
>> Regards,
>>
>> -Brent
>>
>>
>>
>> Existing Clusters:
>>
>> Test: Luminous 12.2.7 with 3 osd servers, 1 mon/man, 1 gateway ( all virtual )
>>
>> US Production: Firefly with 4 osd servers, 3 mons, 3 gateways behind
>> haproxy LB
>>
>> UK Production: Luminous 12.2.7 with 8 osd servers, 3 mons/man, 3
>> gateways behind haproxy LB
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
> --
> Cheers,
> Brad
>

-- 
Cheers,
Brad