Hi Alexander,
So newest_map looks to be slowly growing, and (which is worse) oldest_map is
constant. Which means no old-map pruning is happening and more and more
maps keep coming.
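To put a number on it: in your output newest_map advanced from 2635631 to
2635647 (16 epochs in ~10 minutes) while oldest_map stayed at 2408326, so
the backlog is already 2635647 - 2408326 = 227321 epochs and isn't shrinking.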
What are the numbers today?
You can assess the number of objects in the "meta" pool (that's where
osdmaps are kept) for an OSD by using ceph-objectstore-tool's meta-list
command. This requires the OSD to be offline. Then multiply that count by 4K
(presuming the min alloc unit for these OSDs is 4K) and multiply again by the
number of OSDs to learn the minimal space taken by this data; see the sketch
below. In reality an osdmap can be much(?) larger than 4K, but I don't know
an easy way to assess that in Pacific. Quincy has got a patch for
ceph-objectstore-tool to retrieve such an object, though:
https://github.com/ceph/ceph/pull/39082
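A minimal sketch of that estimate for a single OSD (the OSD id, data path
and output file are just placeholders; stop the OSD first):

systemctl stop ceph-osd@10
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-10 --op meta-list > /tmp/osd10-meta.txt
grep -c osdmap /tmp/osd10-meta.txt    # rough count of osdmap/inc_osdmap objects
systemctl start ceph-osd@10
# minimal space ~= <that count> * 4096 bytes * <number of OSDs>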
And please note that the above doesn't mean there are no other sources
of utilization growth. But apparently it's worth some attention.
Hope this helps,
Igor
On 9/18/2024 9:20 PM, Александр Руденко wrote:
Hi, Igor.
Thank you for your reply!
ceph tell osd.10 status | grep map; echo ---; sleep 600; ceph tell
osd.10 status | grep map
"oldest_map": 2408326,
"newest_map": 2635631,
---
"oldest_map": 2408326,
"newest_map": 2635647,
ceph version is 16.2.13
ceph health:
HEALTH_WARN mons a,b,c,d,e are using a lot of disk space; 8
backfillfull osd(s); 143 nearfull osd(s); Low space hindering backfill
(add storage if this doesn't resolve itself): 25 pgs backfill_toofull;
(muted: BLUESTORE_NO_PER_POOL_OMAP PG_NOT_DEEP_SCRUBBED
POOL_BACKFILLFULL POOL_NEARFULL)
(MUTED, STICKY) [WRN] BLUESTORE_NO_PER_POOL_OMAP: 1756 OSD(s)
reporting legacy (not per-pool) BlueStore omap usage stats
osd.1 legacy (not per-pool) omap detected, suggest to run store
repair to benefit from per-pool omap usage statistics
...
osd.2772 legacy (not per-pool) omap detected, suggest to run
store repair to benefit from per-pool omap usage statistics
[WRN] MON_DISK_BIG: mons a,b,c,d,e are using a lot of disk space
mon.a is 56 GiB >= mon_data_size_warn (15 GiB)
...
[WRN] OSD_BACKFILLFULL: 8 backfillfull osd(s)
osd.12 is backfill full
...
[WRN] OSD_NEARFULL: 143 nearfull osd(s)
osd.1 is near full
...
[WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if
this doesn't resolve itself): 25 pgs backfill_toofull
pg 10.6ea is active+remapped+backfill_toofull, acting [1507,941,2649]
...
(MUTED, STICKY) [WRN] PG_NOT_DEEP_SCRUBBED: 2302 pgs not deep-scrubbed
in time
pg 10.7ffe not deep-scrubbed since 2024-08-23T19:28:22.749150+0300
...
(MUTED, STICKY) [WRN] POOL_BACKFILLFULL: 19 pool(s) backfillfull
pool '.rgw.root' is backfillfull
pool '.rgw.control' is backfillfull
pool '.rgw' is backfillfull
pool '.rgw.gc' is backfillfull
pool '.users.uid' is backfillfull
pool '.users' is backfillfull
pool '.usage' is backfillfull
pool '.intent-log' is backfillfull
pool '.log' is backfillfull
pool '.rgw.buckets' is backfillfull
pool '.rgw.buckets.extra' is backfillfull
pool '.rgw.buckets.index' is backfillfull
pool '.users.email' is backfillfull
pool 'fs1_meta' is backfillfull
pool 'fs1_data' is backfillfull
pool 'fs1_tmp' is backfillfull
pool 'device_health_metrics' is backfillfull
pool 'default.rgw.meta' is backfillfull
On Wed, 18 Sep 2024 at 18:23, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
Hi Alexander,
I recall a couple of cases where permanent osdmap epoch growth was
filling OSDs with the relevant osdmap data. This can be tricky to
catch.
Please run 'ceph tell osd.N status' for a couple of affected OSDs
twice, e.g. with a 10 min interval between the runs.
Then check the delta between the oldest_map and newest_map fields:
the delta should neither be very large (hundreds of thousands) nor
grow rapidly within the observed interval; see the sketch below.
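A minimal sketch of that check, assuming jq is available (substitute
the real OSD id for N):

ceph tell osd.N status | jq '.newest_map - .oldest_map'
sleep 600
ceph tell osd.N status | jq '.newest_map - .oldest_map'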
If either is the case, please share these reports, the 'ceph health
detail' output and the exact Ceph release version you're using.
Thanks,
Igor
On 9/18/2024 2:32 PM, Александр Руденко wrote:
erstand, the majority of these pools contain only
--
Igor Fedotov
Ceph Lead Developer
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx