Re: High usage (DATA column) on dedicated for OMAP only OSDs

Hi Alexander,

So newest_map appears to be slowly growing, and (which is worse) oldest_map is constant. That means no old-map pruning is happening while more and more maps keep arriving.
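
From the output quoted below, the gap is already 2,635,647 - 2,408,326 = 227,321 epochs, and newest_map advanced by another 16 within the 10-minute window while oldest_map did not move at all.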

What are the numbers today?

You can assess the number of objects in the "meta" pool (that's where osdmaps are kept) for an OSD by using ceph-objectstore-tool's meta-list command. This requires the OSD to be offline. Then multiply that count by 4K (presuming the min alloc unit for these OSDs is 4K), and multiply again by the number of OSDs, to get the minimum space taken by this data. In reality an osdmap can be much(?) larger than 4K, but I don't know an easy way to assess that in Pacific. Quincy has got a patch for ceph-objectstore-tool to retrieve such an object, though: https://github.com/ceph/ceph/pull/39082
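
A rough sketch of such a check (osd.10 and the default data path are just placeholders here, and the grep pattern assumes the usual osdmap/inc_osdmap object naming; the OSD has to be stopped first):

systemctl stop ceph-osd@10
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-10 --op meta-list > /tmp/osd10-meta.txt
grep -c osdmap /tmp/osd10-meta.txt    # number of osdmap-related objects in "meta"
systemctl start ceph-osd@10

Multiply that count by 4 KiB for a per-OSD lower bound, then by the number of OSDs for a cluster-wide estimate.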


And please note that the above doesn't mean there are no other sources of utilization growth. But this one apparently deserves some attention.


Hope this helps,

Igor

On 9/18/2024 9:20 PM, Александр Руденко wrote:
Hi, Igor.

Thank you for your reply!

ceph tell osd.10 status| grep map; echo ---; sleep 600; ceph tell osd.10 status| grep map
    "oldest_map": 2408326,
    "newest_map": 2635631,
---
    "oldest_map": 2408326,
    "newest_map": 2635647,

ceph version is 16.2.13

ceph health:

HEALTH_WARN mons a,b,c,d,e are using a lot of disk space; 8 backfillfull osd(s); 143 nearfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 25 pgs backfill_toofull; (muted: BLUESTORE_NO_PER_POOL_OMAP PG_NOT_DEEP_SCRUBBED POOL_BACKFILLFULL POOL_NEARFULL)
(MUTED, STICKY) [WRN] BLUESTORE_NO_PER_POOL_OMAP: 1756 OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats
     osd.1 legacy (not per-pool) omap detected, suggest to run store repair to benefit from per-pool omap usage statistics
     ...
     osd.2772 legacy (not per-pool) omap detected, suggest to run store repair to benefit from per-pool omap usage statistics
[WRN] MON_DISK_BIG: mons a,b,c,d,e are using a lot of disk space
    mon.a is 56 GiB >= mon_data_size_warn (15 GiB)
    ...
[WRN] OSD_BACKFILLFULL: 8 backfillfull osd(s)
    osd.12 is backfill full
    ...
[WRN] OSD_NEARFULL: 143 nearfull osd(s)
    osd.1 is near full
    ...
[WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 25 pgs backfill_toofull
    pg 10.6ea is active+remapped+backfill_toofull, acting [1507,941,2649]
    ...
(MUTED, STICKY) [WRN] PG_NOT_DEEP_SCRUBBED: 2302 pgs not deep-scrubbed in time
    pg 10.7ffe not deep-scrubbed since 2024-08-23T19:28:22.749150+0300
    ...
(MUTED, STICKY) [WRN] POOL_BACKFILLFULL: 19 pool(s) backfillfull
    pool '.rgw.root' is backfillfull
    pool '.rgw.control' is backfillfull
    pool '.rgw' is backfillfull
    pool '.rgw.gc' is backfillfull
    pool '.users.uid' is backfillfull
    pool '.users' is backfillfull
    pool '.usage' is backfillfull
    pool '.intent-log' is backfillfull
    pool '.log' is backfillfull
    pool '.rgw.buckets' is backfillfull
    pool '.rgw.buckets.extra' is backfillfull
    pool '.rgw.buckets.index' is backfillfull
    pool '.users.email' is backfillfull
    pool 'fs1_meta' is backfillfull
    pool 'fs1_data' is backfillfull
    pool 'fs1_tmp' is backfillfull
    pool 'device_health_metrics' is backfillfull
    pool 'default.rgw.meta' is backfillfull

On Wed, Sep 18, 2024 at 18:23, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:

    Hi Alexander,

    I recall a couple of cases when permanent osdmap epoch growth has
    been filling OSDs with the relevant osd map info, which could be
    tricky to catch.

    Please run 'ceph tell osd.N status' for a couple of affected OSDs
    twice within e.g. a 10 min interval.

    Then check the delta between the oldest_map and newest_map fields -
    the delta should neither be very large (hundreds of thousands) nor
    grow rapidly within the observed interval.
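
    A minimal sketch of that check (osd.N and the 10-minute sleep are placeholders; jq is assumed to be available, since the status output is JSON):

    ceph tell osd.N status | jq '.newest_map - .oldest_map'
    sleep 600
    ceph tell osd.N status | jq '.newest_map - .oldest_map'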

    If so, please share these reports, the 'ceph health detail' output
    and the exact Ceph release version you're using.


    Thanks,

    Igor


    On 9/18/2024 2:32 PM, Александр Руденко wrote:
    erstand, the majority of these pools contain only

    --
    Igor Fedotov
    Ceph Lead Developer

    Looking for help with your Ceph cluster? Contact us at https://croit.io

    croit GmbH, Freseniusstr. 31h, 81247 Munich
    CEO: Martin Verges - VAT-ID: DE310638492
    Com. register: Amtsgericht Munich HRB 231263
    Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



