Re: 1 Large omap object found

Mark Johnson <markj@xxxxxxxxx> · Wed, 2 Aug 2023 05:44:53 +0000

Never mind, I think I worked it out.  I consulted the Quincy
documentation which just said to do this:

ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 2000000

But when I did that, the health warning didn't clear.  I guessed that
the value is probably only checked against the threshold at scrub
time, so I triggered a deep scrub on that PG, and once that finished
the health warning cleared.  I'm not sure whether the setting persists
across restarts, but I'll cross that bridge if/when I come to it.
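
A minimal sketch of that sequence, for anyone hitting the same warning. The function name apply_omap_threshold is made up for illustration; the PG ID and threshold in the usage line are the ones from this thread. Values written with "ceph config set" are kept in the monitors' central configuration database, so the setting should survive OSD restarts; "ceph config dump" can confirm that after a restart.

```shell
# Raise the large-omap key threshold for all OSDs, show the stored value,
# then deep-scrub one PG so the object is re-checked against the new threshold.
# apply_omap_threshold is a hypothetical helper name, not a Ceph command.
apply_omap_threshold() {
    pgid="$1"
    threshold="$2"
    ceph config set osd osd_deep_scrub_large_omap_object_key_threshold "$threshold" &&
    ceph config get osd osd_deep_scrub_large_omap_object_key_threshold &&
    ceph pg deep-scrub "$pgid"
}
```

Usage for the case in this thread would be "apply_omap_threshold 5.16 2000000". The health warning is only expected to clear once the deep scrub of that PG has finished.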

On Wed, 2023-08-02 at 05:31 +0000, Mark Johnson wrote:
> Regarding changing this value back to the previous default of
> 2,000,000, how would I go about doing that?  I tried following that
> SUSE KB article which says to do this:
> 
> ceph tell 'osd.*' injectargs -- osd_deep_scrub_large_omap_object_key_threshold=2000000
> 
> But while that didn't fail as such, it didn't apply any changes.  Is
> there a way to apply this on the fly without restarting the cluster?
> 
> 
> 
> 
> On Tue, 2023-08-01 at 22:44 +0000, Mark Johnson wrote:
> Thanks for that.  That's pretty much how I was reading it, but the
> text you provided is a lot more explanatory than what I'd managed to
> find and makes it a bit clearer.  Without going into too much detail,
> yes we do have a single user that is used to create a bucket for
> each of multiple tenants on a daily basis.  So, we'd be creating
> many buckets each day, all owned by the same account.
> Therefore, it's quite possible that there could be 400,000 buckets
> owned by the one user.  I don't know an easy way to get a figure - I
> tried sending "radosgw-admin bucket stats" output to a file but after
> about 4 hours it still hadn't returned anything so I gave up.
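
One possibly quicker way to get a rough figure is to count the omap keys on the user's ".buckets" object directly, since that is the object the warning complains about. This is only a sketch: count_user_buckets is a made-up helper name, and it assumes each omap key on that object corresponds to one bucket owned by the user, as the SUSE text quoted in this thread suggests. The pool, namespace and object name in the usage line are taken from the warning in the original post.

```shell
# Count omap keys on a user's ".buckets" object in the RGW meta pool.
# count_user_buckets is a hypothetical helper; it assumes one omap key per bucket.
# $1 = pool name, $2 = object name (stored in the "users.uid" namespace).
count_user_buckets() {
    rados -p "$1" --namespace users.uid listomapkeys "$2" | awk 'END { print NR }'
}
```

For this cluster the call would be: count_user_buckets default.rgw.meta 'callrecordings$callrecordings_rw.buckets' (single quotes so the shell does not expand the "$"). The result should be close to the OMAP_KEYS figure reported for PG 5.16.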
> 
> I have a feeling that we do have a rolling clean out of objects in
> these buckets, so we might be only keeping 3 months of data for some
> customers, 6 months for others, 12 months for others etc.  But, I
> think one of our guys mentioned that the cleanup might not be getting
> rid of buckets, only the files in them.  So, I may have to get our
> dev guys to revisit this and see if we can clean up a crapload of
> empty buckets.
> 
> 
> On Tue, 2023-08-01 at 08:37 +0000, Eugen Block wrote:
> Thanks. Just for reference I'm quoting the SUSE doc [1] you mentioned
> because it explains what you already summarized:
> 
> User indices are not sharded, in other words we store all the keys
> of names of buckets under one object. This can cause large objects
> to be found. The large object is only accessed in the List All
> Buckets S3/Swift API. Unlike bucket indices, the large object is not
> exactly in the object IO path. Depending on the use case for so many
> buckets, the warning isn't dangerous as the large object is only
> used for the List All Buckets API.
> The error shows a user has 500K buckets causing the omap issue.
> Sharding does not occur at the user level. Bucket indexes are
> sharded but buckets per user is not (and usually the default
> max_bucket is 1000).
> 
> Does this mean that you actually have a user with around 400k
> buckets?
> If you can't delete unused buckets (you already ruled out creating
> multiple users) there's probably no way around increasing the
> threshold, I guess. I'm not the biggest RGW expert but we have a few
> customers where the threshold was actually increased to the previous
> default to get rid of the warning (if other actions were not
> possible). So far we haven't had any reports of this causing any
> issues. But I'd be curious if the devs or someone with more
> experience has better advice.
> 
> [1] https://www.suse.com/support/kb/doc/?id=000019698
> 
> Quoting Mark Johnson <markj@xxxxxxxxx>:
> 
> Here you go.  It doesn't format very well, so I'll summarize what I'm
> seeing.
> 
> 5.c has 78051 OMAP_BYTES and 398 OMAP_KEYS
> 5.16 has 80186950 OMAP_BYTES and 401505 OMAP_KEYS
> 
> The remaining 30 PGs have zero of both.  However, the BYTES for each
> PG is very much the same at around 8900000 for each.
> 
> 
> # ceph pg ls-by-pool default.rgw.meta
> 
> PG    OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES    OMAP_BYTES*  OMAP_KEYS*  LOG    STATE                        SINCE  VERSION         REPORTED        UP             ACTING         SCRUB_STAMP                      DEEP_SCRUB_STAMP                 LAST_SCRUB_DURATION  SCRUB_SCHEDULING
> 5.0     26240         0          0        0  8909864            0           0  10076                 active+clean    10h      8093'54176    8093:5396520   [21,4,12]p21   [21,4,12]p21  2023-07-31T21:13:20.554485+0000  2023-07-26T03:40:27.457946+0000                    5  periodic scrub scheduled @ 2023-08-01T23:55:14.134653+0000
> 5.1     26065         0          0        0  8840849            0           0  10029                 active+clean    10h      8093'56529    8093:4891333   [14,7,23]p14   [14,7,23]p14  2023-07-31T20:37:34.920128+0000  2023-07-30T10:55:16.529046+0000                    5  periodic scrub scheduled @ 2023-08-01T21:12:04.440688+0000
> 5.2     26406         0          0        0  8943783            0           0  10076                 active+clean    20h      8093'56776    8093:5022283   [26,8,25]p26   [26,8,25]p26  2023-07-31T11:08:00.886979+0000  2023-07-30T06:03:44.341435+0000                    5  periodic scrub scheduled @ 2023-08-01T16:02:32.076634+0000
> 5.3     26250         0          0        0  8932714            0           0  10086                 active+clean    20h      8093'56786    8093:5109316    [0,26,32]p0    [0,26,32]p0  2023-07-31T11:02:35.864452+0000  2023-07-30T04:25:56.495524+0000                    5  periodic scrub scheduled @ 2023-08-01T20:18:30.975924+0000
> 5.4     26071         0          0        0  8874237            0           0  10024                 active+clean     6h      8092'53824    8093:5146409   [15,7,34]p15   [15,7,34]p15  2023-08-01T01:16:48.361184+0000  2023-07-25T15:47:10.627594+0000                    5  periodic scrub scheduled @ 2023-08-02T12:32:06.359395+0000
> 5.5     26160         0          0        0  8870317            0           0  10073                 active+clean    12h      8093'56173    8093:4706658    [9,31,16]p9    [9,31,16]p9  2023-07-31T18:52:26.301525+0000  2023-07-29T08:19:00.537322+0000                    5  periodic scrub scheduled @ 2023-08-02T02:11:16.267794+0000
> 5.6     26186         0          0        0  8904446            0           0  10084                 active+clean    44m      8093'57584    8093:5032349    [7,10,38]p7    [7,10,38]p7  2023-08-01T06:37:45.184419+0000  2023-08-01T06:37:45.184419+0000                  313  periodic scrub scheduled @ 2023-08-02T10:01:32.285716+0000
> 5.7     26292         0          0        0  8908213            0           0   9695                 active+clean    87m      8093'56896    8093:4969718   [36,1,13]p36   [36,1,13]p36  2023-08-01T05:55:06.016287+0000  2023-07-30T21:49:33.028594+0000                    5  periodic scrub scheduled @ 2023-08-02T13:54:38.778542+0000
> 5.8     26323         0          0        0  8911110            0           0   9747                 active+clean     3h      8093'56448    8093:4981465   [36,15,2]p36   [36,15,2]p36  2023-08-01T04:21:06.360778+0000  2023-07-29T14:46:02.363530+0000                    5  periodic scrub scheduled @ 2023-08-02T10:36:48.764085+0000
> 5.9     26035         0          0        0  8829430            0           0  10034                 active+clean    20h      8093'56335    8093:4829155  [37,21,24]p37  [37,21,24]p37  2023-07-31T11:07:39.961751+0000  2023-07-31T11:07:39.961751+0000                  309  periodic scrub scheduled @ 2023-08-01T22:42:18.862879+0000
> 5.a     26052         0          0        0  8859067            0           0  10087                 active+clean    27h      8092'56087    8093:5022933    [2,23,10]p2    [2,23,10]p2  2023-07-31T03:28:44.433360+0000  2023-07-31T03:28:44.433360+0000                  248  periodic scrub scheduled @ 2023-08-01T13:36:39.897693+0000
> 5.b     25759         0          0        0  8739834            0           0   9693                 active+clean    15h      8090'56293    8093:4837010   [36,7,28]p36   [36,7,28]p36  2023-07-31T15:55:00.415967+0000  2023-07-31T15:55:00.415967+0000                  323  periodic scrub scheduled @ 2023-08-01T23:41:03.756058+0000
> 5.c     25927         0          0        0  8788271        78051         398  10051                 active+clean    24h     8093'174851    8093:4982667    [5,36,18]p5    [5,36,18]p5  2023-07-31T07:20:32.208533+0000  2023-07-31T07:20:32.208533+0000                  315  periodic scrub scheduled @ 2023-08-01T18:18:15.292651+0000
> 5.d     25995         0          0        0  8815306            0           0  10070                 active+clean     2h      8093'57270    8093:4994478  [32,13,16]p32  [32,13,16]p32  2023-08-01T04:27:55.863933+0000  2023-08-01T04:27:55.863933+0000                  294  periodic scrub scheduled @ 2023-08-02T05:55:30.108279+0000
> 5.e     26253         0          0        0  8939984            0           0  10018                 active+clean     5h      8092'56919    8093:5135033   [37,19,4]p37   [37,19,4]p37  2023-08-01T01:38:15.740983+0000  2023-07-30T21:55:45.349878+0000                    5  periodic scrub scheduled @ 2023-08-02T04:40:09.172157+0000
> 5.f     26025         0          0        0  8821973            0           0  10020                 active+clean    25h      8093'53120    8093:4794909   [14,11,8]p14   [14,11,8]p14  2023-07-31T06:22:22.849194+0000  2023-07-25T02:42:24.135997+0000                    5  periodic deep scrub scheduled @ 2023-08-01T13:50:38.914828+0000
> 5.10    25999         0          0        0  8821303            0           0  10048                 active+clean     3h      8092'55848    8093:5151525  [39,27,14]p39  [39,27,14]p39  2023-08-01T03:40:44.355521+0000  2023-07-29T23:58:49.567904+0000                    5  periodic scrub scheduled @ 2023-08-02T04:44:52.459615+0000
> 5.11    26200         0          0        0  8897148            0           0   8909                 active+clean    23h      8093'56309    8093:4858657  [35,24,23]p35  [35,24,23]p35  2023-07-31T08:19:21.090885+0000  2023-07-30T07:21:21.135342+0000                    5  periodic scrub scheduled @ 2023-08-01T13:09:38.620801+0000
> 5.12    26043         0          0        0  8803496            0           0  10016                 active+clean    33h      8093'52716    8093:5090415   [21,35,3]p21   [21,35,3]p21  2023-07-30T22:10:31.308788+0000  2023-07-24T14:40:34.453392+0000                    5  periodic deep scrub scheduled @ 2023-08-01T09:36:02.058413+0000
> 5.13    25929         0          0        0  8785411            0           0  10029                 active+clean    16h      8090'54629    8093:5096641   [32,17,9]p32   [32,17,9]p32  2023-07-31T15:19:04.119491+0000  2023-07-27T16:36:53.401620+0000                    5  periodic scrub scheduled @ 2023-08-01T22:19:03.573063+0000
> 5.14    26102         0          0        0  8858274            0           0  10069                 active+clean    89m      8093'54671    8093:4958083    [3,29,12]p3    [3,29,12]p3  2023-08-01T05:53:20.722831+0000  2023-07-27T11:29:26.930179+0000                    5  periodic scrub scheduled @ 2023-08-02T14:03:11.521746+0000
> 5.15    26122         0          0        0  8850708            0           0   9419                 active+clean    14h      8093'57119    8093:4854254   [28,8,29]p28   [28,8,29]p28  2023-07-31T17:04:04.790500+0000  2023-07-31T17:04:04.790500+0000                  309  periodic scrub scheduled @ 2023-08-02T04:29:01.728903+0000
> 5.16    26168         0          0        0  8869093     80186950      401505  10031                 active+clean    32h  8093'122396977  8093:127435157   [26,39,9]p26   [26,39,9]p26  2023-07-30T23:19:12.784044+0000  2023-07-30T23:19:12.784044+0000                  258  periodic scrub scheduled @ 2023-08-01T10:29:11.462563+0000
> 5.17    25504         0          0        0  8634818            0           0  10081                 active+clean    23h      8093'55481    8093:4742014    [4,25,34]p4    [4,25,34]p4  2023-07-31T08:11:23.105601+0000  2023-07-31T08:11:23.105601+0000                  309  periodic scrub scheduled @ 2023-08-01T15:11:19.240302+0000
> 5.18    26143         0          0        0  8846680            0           0  10014                 active+clean    23h      8093'55015    8093:4927120  [22,11,36]p22  [22,11,36]p22  2023-07-31T08:18:27.757381+0000  2023-07-27T19:04:02.036522+0000                    5  periodic scrub scheduled @ 2023-08-01T16:25:12.741673+0000
> 5.19    26117         0          0        0  8864860            0           0  10073                 active+clean    27h      8093'55173    8093:5001362    [1,28,27]p1    [1,28,27]p1  2023-07-31T03:37:40.525594+0000  2023-07-28T10:50:18.232627+0000                    5  periodic scrub scheduled @ 2023-08-01T15:06:48.300456+0000
> 5.1a    26186         0          0        0  8870466            0           0  10025                 active+clean    31h      8093'54025    8093:4991279  [34,17,22]p34  [34,17,22]p34  2023-07-31T00:20:28.853158+0000  2023-07-25T21:31:55.045662+0000                    5  periodic scrub scheduled @ 2023-08-01T09:42:20.646380+0000
> 5.1b    26070         0          0        0  8854703            0           0  10087                 active+clean     8h      8093'56487    8093:4886996   [22,7,37]p22   [22,7,37]p22  2023-07-31T22:39:17.793412+0000  2023-07-30T16:07:40.211725+0000                    6  periodic scrub scheduled @ 2023-08-01T23:23:58.360930+0000
> 5.1c    26302         0          0        0  8925675            0           0  10015                 active+clean    16h      8093'54915    8093:4986627   [33,11,2]p33   [33,11,2]p33  2023-07-31T15:07:27.474683+0000  2023-07-27T15:11:42.794360+0000                    5  periodic scrub scheduled @ 2023-08-01T15:51:01.248015+0000
> 5.1d    26075         0          0        0  8839499            0           0  10075                 active+clean    21h      8093'52575    8093:4857107  [33,16,17]p33  [33,16,17]p33  2023-07-31T09:23:24.577919+0000  2023-07-24T16:52:37.965968+0000                    5  periodic deep scrub scheduled @ 2023-08-01T11:33:35.067584+0000
> 5.1e    26182         0          0        0  8905099            0           0   8905  active+clean+scrubbing+deep     4m      8093'56905    8093:4936987  [35,20,15]p35  [35,20,15]p35  2023-07-31T04:25:51.171690+0000  2023-07-24T14:28:19.531701+0000                    5  deep scrubbing for 259s
> 5.1f    25978         0          0        0  8796478            0           0  10068                 active+clean    13h      8092'56868    8093:4813791  [26,30,13]p26  [26,30,13]p26  2023-07-31T17:50:40.349450+0000  2023-07-31T17:50:40.349450+0000                  311  periodic scrub scheduled @ 2023-08-02T04:39:41.913504+0000
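
With 32 rows of very wide output, the PGs that actually hold omap data are easy to miss. A small filter over the plain-text listing can pick them out. This is a sketch that assumes the column order shown in the header of this listing (column 7 = OMAP_BYTES, column 8 = OMAP_KEYS); omap_pgs is a made-up helper name, and the sample rows are abbreviated from this thread.

```shell
# Print "PG OMAP_BYTES OMAP_KEYS" for every PG with a non-zero OMAP_KEYS count.
# Reads plain `ceph pg ls-by-pool <pool>` output on stdin; header and note
# lines are skipped because their first field is not a PG ID such as "5.16".
omap_pgs() {
    awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ && $8 + 0 > 0 { print $1, $7, $8 }'
}

# Example using three rows cut down to their first eight columns:
omap_pgs <<'EOF'
PG    OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES    OMAP_BYTES*  OMAP_KEYS*
5.b     25759         0          0        0  8739834            0           0
5.c     25927         0          0        0  8788271        78051         398
5.16    26168         0          0        0  8869093     80186950      401505
EOF
# prints:
# 5.c 78051 398
# 5.16 80186950 401505
```

Against a live cluster this would be run as: ceph pg ls-by-pool default.rgw.meta | omap_pgs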
> 
> 
> On Tue, 2023-08-01 at 06:14 +0000, Eugen Block wrote:
> Yeah, regarding data distribution, increasing the pg_num of the data
> pool is recommended. But could you also share the output of:
> 
> ceph pg ls-by-pool default.rgw.meta
> 
> That's where the large omap was reported, maybe you'll need to
> increase the pg_num for that pool as well. Personally, I always
> disable the autoscaler.
> 
> Quoting Mark Johnson <markj@xxxxxxxxx>:
> 
> Thanks Bailey,
> 
> With regards to the PG count, we've been relying on PG autoscale and
> it is currently enabled.  I figure I'd need to disable autoscale and
> manually increase the PG count on the default.rgw.buckets.data pool,
> correct?  We're coming from our existing clusters running Jewel to
> this new Quincy cluster and have no prior experience with autoscale,
> so we were just assuming autoscale would manage PG counts better
> than us doing it manually.  As you can probably guess, we don't have
> much experience with Ceph.
> 
> Regards,
> Mark Johnson
> 
> 
> On Mon, 2023-07-31 at 21:54 -0300, Bailey Allison wrote:
> 
> Hi,
> 
> It appears you have quite a low PG count on your cluster (approx. 20
> PGs per OSD).
> 
> It is usually recommended to have about 100-150 PGs per OSD. With a
> lower PG count you can have issues with balancing data, which can
> cause errors such as large OMAP objects.
> 
> This might not be the fix in this case, but either way I would still
> recommend increasing PGs on your pools.
> 
> If you look at the OMAP value in your ceph osd df you can see that
> some OSDs have 2GB while some have 500MB. Even for data, some drives
> are holding 900GB while others hold 2TB.
> 
> You will also have to issue a deep-scrub on the PGs to get updated
> OMAP data once the PGs are increased.
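
The PGs-per-OSD figure can be checked from the numbers posted in this thread. The sketch below sums the pg_num values from the "ceph df" POOLS table (337 in total) and divides by the 40 OSDs listed in "ceph osd df". The replica count of 3 is an assumption inferred from the three-OSD UP sets in the PG listing; it is consistent with the PGS column of "ceph osd df", which sums to 1011 = 337 x 3. pgs_per_osd is a made-up helper name.

```shell
# Average PG replicas per OSD = total pg_num * replica count / number of OSDs.
# $1 = sum of pg_num over all pools, $2 = replica count, $3 = number of OSDs.
pgs_per_osd() {
    awk -v pgs="$1" -v size="$2" -v osds="$3" \
        'BEGIN { printf "%.1f\n", pgs * size / osds }'
}

# pg_num per pool, in the order of the POOLS table in this thread.
total_pgs=$((1 + 32 + 32 + 32 + 32 + 32 + 128 + 16 + 32))   # 337
pgs_per_osd "$total_pgs" 3 40
# prints: 25.3
```

That is roughly 25 PGs per OSD, in the same range as the estimate above and well below the 100-150 usually recommended.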
> 
> Regards,
> 
> Bailey
> 
> -----Original Message-----
> From: Mark Johnson <markj@xxxxxxxxx>
> Sent: July 31, 2023 9:01 PM
> To: eblock@xxxxxx; ceph-users@xxxxxxx
> Subject: Re: 1 Large omap object found
> 
> Sure thing.  Thanks for the reply.
> 
> ceph df
> 
> --- RAW STORAGE ---
> CLASS     SIZE    AVAIL    USED  RAW USED  %RAW USED
> hdd    291 TiB  244 TiB  47 TiB    47 TiB      16.02
> TOTAL  291 TiB  244 TiB  47 TiB    47 TiB      16.02
> 
> --- POOLS ---
> POOL                       ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
> .mgr                        1    1  459 MiB      116  1.3 GiB      0     65 TiB
> .rgw.root                   2   32  1.3 KiB        4   48 KiB      0     65 TiB
> default.rgw.log             3   32  5.3 KiB      209  468 KiB      0     65 TiB
> default.rgw.control         4   32      0 B        8      0 B      0     65 TiB
> default.rgw.meta            5   32  452 MiB  828.75k   10 GiB      0     65 TiB
> default.rgw.buckets.index   6   32   17 GiB    4.56M   51 GiB   0.03     65 TiB
> default.rgw.buckets.data    7  128   15 TiB   54.51M   46 TiB  19.24     65 TiB
> cephfs_metadata             8   16  258 MiB       98  775 MiB      0     65 TiB
> cephfs_data                 9   32  1.9 GiB      998  5.6 GiB      0     65 TiB
> 
> 
> ceph osd df
> 
> ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP      META     AVAIL    %USE   VAR   PGS  STATUS
>  0    hdd  7.27739   1.00000  7.3 TiB  1.6 TiB  1.6 TiB   550 MiB   12 GiB  5.7 TiB  21.70  1.35   21      up
>  1    hdd  7.27739   1.00000  7.3 TiB  995 GiB  986 GiB   1.1 GiB  7.6 GiB  6.3 TiB  13.35  0.83   28      up
>  2    hdd  7.27739   1.00000  7.3 TiB  996 GiB  986 GiB   2.1 GiB  7.9 GiB  6.3 TiB  13.37  0.83   22      up
>  3    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB   513 MiB   10 GiB  5.9 TiB  18.35  1.15   28      up
>  4    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   527 MiB  8.3 GiB  6.2 TiB  15.02  0.94   22      up
>  5    hdd  7.27739   1.00000  7.3 TiB  1.8 TiB  1.8 TiB   1.5 GiB   14 GiB  5.5 TiB  25.01  1.56   28      up
>  6    hdd  7.27739   1.00000  7.3 TiB  746 GiB  739 GiB   1.0 GiB  5.8 GiB  6.5 TiB  10.01  0.63   20      up
>  7    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   1.1 GiB  8.8 GiB  6.2 TiB  15.04  0.94   20      up
>  8    hdd  7.27739   1.00000  7.3 TiB  871 GiB  864 GiB   544 MiB  6.7 GiB  6.4 TiB  11.69  0.73   27      up
>  9    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB   3.3 GiB   11 GiB  5.9 TiB  18.37  1.15   28      up
> 30    hdd  7.27739   1.00000  7.3 TiB  1.8 TiB  1.8 TiB   1.6 GiB   14 GiB  5.5 TiB  25.01  1.56   35      up
> 31    hdd  7.27739   1.00000  7.3 TiB  747 GiB  739 GiB   2.2 GiB  6.2 GiB  6.5 TiB  10.03  0.63   20      up
> 32    hdd  7.27739   1.00000  7.3 TiB  996 GiB  987 GiB   1.5 GiB  7.9 GiB  6.3 TiB  13.37  0.83   26      up
> 33    hdd  7.27739   1.00000  7.3 TiB  995 GiB  985 GiB   1.5 GiB  7.7 GiB  6.3 TiB  13.35  0.83   25      up
> 34    hdd  7.27739   1.00000  7.3 TiB  750 GiB  742 GiB   2.1 GiB  5.7 GiB  6.5 TiB  10.07  0.63   25      up
> 35    hdd  7.27739   1.00000  7.3 TiB  2.1 TiB  2.0 TiB   571 MiB   15 GiB  5.2 TiB  28.36  1.77   34      up
> 36    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB   1.5 GiB   10 GiB  5.9 TiB  18.37  1.15   31      up
> 37    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   524 MiB  8.2 GiB  6.2 TiB  14.99  0.94   26      up
> 38    hdd  7.27739   1.00000  7.3 TiB  1.6 TiB  1.6 TiB   1.6 GiB   12 GiB  5.7 TiB  21.70  1.35   28      up
> 39    hdd  7.27739   1.00000  7.3 TiB  1.5 TiB  1.4 TiB   2.4 GiB   11 GiB  5.8 TiB  20.04  1.25   30      up
> 10    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB   1.6 GiB   10 GiB  5.9 TiB  18.34  1.14   26      up
> 12    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB     1 KiB  9.9 GiB  5.9 TiB  18.37  1.15   25      up
> 14    hdd  7.27739   1.00000  7.3 TiB  1.5 TiB  1.4 TiB   593 MiB   10 GiB  5.8 TiB  19.98  1.25   22      up
> 16    hdd  7.27739   1.00000  7.3 TiB  997 GiB  987 GiB   2.2 GiB  7.5 GiB  6.3 TiB  13.38  0.84   19      up
> 18    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   1.1 GiB  8.7 GiB  6.2 TiB  15.02  0.94   26      up
> 20    hdd  7.27739   1.00000  7.3 TiB  1.6 TiB  1.6 TiB   1.1 GiB   12 GiB  5.7 TiB  21.68  1.35   26      up
> 22    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB   559 MiB   10 GiB  5.9 TiB  18.34  1.14   22      up
> 24    hdd  7.27739   1.00000  7.3 TiB  872 GiB  864 GiB  1020 MiB  6.8 GiB  6.4 TiB  11.70  0.73   23      up
> 26    hdd  7.27739   1.00000  7.3 TiB  749 GiB  741 GiB   1.8 GiB  6.3 GiB  6.5 TiB  10.05  0.63   25      up
> 28    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB   1.5 GiB   10 GiB  5.9 TiB  18.36  1.15   32      up
> 11    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   2.6 GiB  8.5 GiB  6.2 TiB  15.02  0.94   23      up
> 13    hdd  7.27739   1.00000  7.3 TiB  1.3 TiB  1.3 TiB   2.2 GiB   10 GiB  5.9 TiB  18.38  1.15   36      up
> 15    hdd  7.27739   1.00000  7.3 TiB  995 GiB  986 GiB   1.1 GiB  7.7 GiB  6.3 TiB  13.35  0.83   25      up
> 17    hdd  7.27739   1.00000  7.3 TiB  623 GiB  618 GiB   419 KiB  5.0 GiB  6.7 TiB   8.35  0.52   23      up
> 19    hdd  7.27739   1.00000  7.3 TiB  870 GiB  863 GiB   513 MiB  6.6 GiB  6.4 TiB  11.67  0.73   21      up
> 21    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   1.5 GiB  8.6 GiB  6.2 TiB  15.02  0.94   25      up
> 23    hdd  7.27739   1.00000  7.3 TiB  746 GiB  739 GiB   564 MiB  5.8 GiB  6.5 TiB  10.01  0.62   22      up
> 25    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   2.1 GiB  8.4 GiB  6.2 TiB  15.03  0.94   24      up
> 27    hdd  7.27739   1.00000  7.3 TiB  1.2 TiB  1.2 TiB   532 MiB  9.1 GiB  6.1 TiB  16.68  1.04   23      up
> 29    hdd  7.27739   1.00000  7.3 TiB  1.1 TiB  1.1 TiB   1.1 GiB  8.4 GiB  6.2 TiB  14.99  0.94   19      up
>                        TOTAL  291 TiB   47 TiB   46 TiB    51 GiB  359 GiB  244 TiB  16.02
> MIN/MAX VAR: 0.52/1.77  STDDEV: 4.56
> 
> 
> On Mon, 2023-07-31 at 09:22 +0000, Eugen Block wrote:
> Hi,
> 
> can you share some more details like 'ceph df' and 'ceph osd df'? I
> don't have too much advice yet, but to see all entries in your meta
> pool you need to add the --all flag because those objects are stored
> in namespaces:
> 
> rados -p default.rgw.meta ls --all
> 
> That pool contains user and bucket information (example):
> 
> # rados -p default.rgw.meta ls --all
> users.uid       admin.buckets
> users.keys      c0fba3ea7d9c4321b5205752c85baa85
> users.uid       admin
> users.keys      JBWPRAPP1AQG471AMGC4
> users.uid       e434b82737cf4138b899c0785b49112d.buckets
> users.uid       e434b82737cf4138b899c0785b49112d
> 
> 
> 
> Quoting Mark Johnson <markj@xxxxxxxxx>:
> 
> I've been going round and round in circles trying to work this one
> out but I'm getting nowhere.  We're running a 4 node quincy cluster
> (17.2.6) which recently reported the following:
> 
> ceph.log-20230729.gz:2023-07-28T08:31:42.390003+0000 osd.26 (osd.26) 13834 : cluster [WRN] Large omap object found. Object: 5:6c65dd84:users.uid::callrecordings$callrecordings_rw.buckets:head PG: 5.21bba636 (5.16) Key count: 378454 Size (bytes): 75565579
> 
> This happened a week or so ago (though the key count was only just
> over the 200000 threshold on that occasion) and after much searching
> around, I found an article that suggested a deep scrub on the PG
> would likely resolve the issue, so I forced a deep scrub and shortly
> after, the warning cleared.  Came into the office today to discover
> the above.  It's on the same PG as before, which is in the
> default.rgw.meta pool.  This time, after forcing a deep-scrub on
> that PG, nothing changed.  I did it a second time just to be sure
> but got the same result.
> 
> I keep finding a SUSE article that simply suggests increasing the
> threshold to the previous default of 2,000,000, but other articles I
> read say it was lowered for a reason and that by the time it hits
> that figure, it's too late, so I don't want to just mask it.  The
> problem is that I don't really understand it.  I found a thread here
> from a bit over two years ago but their issue was in the
> default.rgw.buckets.index pool.  A step in the solution was to list
> out the problematic object id and check the objects per shard;
> however, if I issue the command "rados -p default.rgw.meta ls" it
> returns nothing.  I get a big list from "rados -p
> default.rgw.buckets.index ls", just nothing from the first pool.  I
> think it may be because the meta pool isn't indexed, based on
> something I read, but I really don't know what I'm talking about
> tbh.
> 
> I don't know if this is helpful, but if I list out all the PGs for
> that pool, there are 32 PGs and 5.16 shows 80186950 bytes and 401505
> keys.  PG 5.c has 75298 bytes and 384 keys.  The remaining 30 PGs
> show zero bytes and zero keys.  I'm really not sure how to
> troubleshoot and resolve from here.  For the record, dynamic
> resharding is enabled in that no options have been set in the config
> and that is the default setting.
> 
> Based on the SUSE article I mentioned, which also references the
> default.rgw.meta pool, I'm gathering our issue is because we have so
> many buckets that are all owned by the one user and the solution is
> either:
> 
> * delete unused buckets
> * create multiple users and spread buckets evenly across all users
>   (not something we can do)
> * increase the threshold to stop the warning
> 
> Problem is that I'm having trouble verifying this is the issue.  I've
> tried dumping out bucket stats to a file
> (radosgw-admin bucket stats > bucket_stats.txt) but after three hours
> this is still running with no output.
> 
> Thanks for your time,
> Mark
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> 
> 

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx