Re: Not all Bucket Shards being used

1. I recommend that you *not* issue another bucket reshard until you figure out what’s going on.
2. Which version of Ceph are you using?
3. Can you issue a `radosgw-admin metadata get bucket:<bucket-name>` so we can verify what the current marker is? (A sketch of pulling out just the relevant fields follows after this list.)
4. After you resharded previously, did you get command-line output along the lines of:
	2023-07-24T13:33:50.867-0400 7f10359f2a80 1 execute INFO: reshard of bucket "<bucket-name>" completed successfully
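
For item 3, a minimal sketch of pulling just the marker and bucket_id out of that metadata (assuming jq is available; the exact JSON layout can differ slightly between releases):

	radosgw-admin metadata get bucket:<bucket-name> \
	    | jq -r '.data.bucket | "\(.marker) \(.bucket_id)"'

After a successful reshard the bucket_id normally changes while the marker stays the same, so comparing the two (and both against the .dir.* index object names) tells us which bucket index instance those large omap objects belong to.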

Eric
(he/him)

P.S. It’s likely obvious, but in the above replace <bucket-name> with the actual bucket name.

> On Jul 18, 2023, at 10:18 AM, Christian Kugler <syphdias+ceph@xxxxxxxxx> wrote:
> 
> Hi,
> 
> I am having trouble with large omap objects in the RGW index pool of a cluster. Some
> background information about the cluster: there is CephFS and RBD usage on the
> main cluster, but for this issue I think only S3 is relevant.
> There is one realm and one zonegroup with two zones, which have bidirectional sync
> set up. Since this does not allow for auto-resharding, we have to reshard by hand
> in this cluster (looking forward to Reef!).
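> (By "by hand" I mean the usual manual reshard, something along the lines of
>     radosgw-admin bucket reshard --bucket=<bucket-name> --num-shards=<n>
> with <bucket-name> and <n> filled in per bucket.)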
> 
> From the logs:
> cluster 2023-07-17T22:59:03.018722+0000 osd.75 (osd.75) 623978 :
> cluster [WRN] Large omap object found. Object:
> 34:bcec3016:::.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5:head
> PG: 34.680c373d (34.5) Key count: 962091 Size (bytes): 277963182
> 
> The offending bucket looks like this:
> # radosgw-admin bucket stats \
>    | jq '.[] | select(.marker
> =="3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9")
>              |"\(.num_shards) \(.usage["rgw.main"].num_objects)"' -r
> 131 9463833
> 
> Last week the number of objects was about 12 million, which is why I resharded the
> offending bucket twice, I think: once to 129 and a second time to 131, because I
> wanted some leeway (or lieway? scnr, Sage).
> 
> Unfortunately, even after a week the omap objects were still too big (the log line
> above is quite recent), so I looked into it again.
> 
> # rados -p raum.rgw.buckets.index ls \
>    |grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
>    |sort -V
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.0
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.1
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.2
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.3
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.4
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.6
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.7
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.8
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.9
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.10
> # rados -p raum.rgw.buckets.index ls \
>    |grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
>    |sort -V \
>    |xargs -IOMAP sh -c \
>        'rados -p raum.rgw.buckets.index listomapkeys OMAP | wc -l'
> 1013854
> 1011007
> 1012287
> 1011232
> 1013565
> 998262
> 1012777
> 1012713
> 1012230
> 1010690
> 997111
> 
> Apparently, only 11 shards are in use. This would explain why the key count from the
> log line is about ten times higher than I would expect: 9,463,833 objects spread over
> 131 shards should come to roughly 72k keys per shard, not ~1M.
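> 
> A variant that prints each index object next to its key count makes this easier to
> see; a sketch against the same pool and bucket marker:
> 
> # rados -p raum.rgw.buckets.index ls \
>    |grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
>    |sort -V \
>    |while read -r obj; do \
>        echo "$obj $(rados -p raum.rgw.buckets.index listomapkeys "$obj" | wc -l)"; \
>    done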
> 
> How can I deal with this issue?
> One thing I could try to fix this would be to reshard to a lower number, but I am
> not sure whether there are any risks associated with "downsharding". After that I
> could reshard to something like 97, or I could "downshard" directly to 97.
> 
> Also, the second zone has a similar problem, but as the error message lets me
> know, resharding there would be a bad idea. Will it just take more time until the
> sharding is transferred to the second zone?
> 
> Best,
> Christian Kugler
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



