Hi,

On a cluster we adopted, Prometheus fired its "osd full > 90%" alert, but
Ceph itself did not warn. The OSD is in fact being drained (see %USE slowly
dropping across the three runs below):

root@host# ceph osd df name osd.696
ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL  %USE  VAR  PGS STATUS
696 nvme  0.91199 1.00000  912 GiB 830 GiB 684 GiB 8 KiB   146 GiB 81 GiB 91.09 1.00  47 up
                     TOTAL 912 GiB 830 GiB 684 GiB 8.1 KiB 146 GiB 81 GiB 91.09
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

root@host# ceph osd df name osd.696
ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL  %USE  VAR  PGS STATUS
696 nvme  0.91199 1.00000  912 GiB 830 GiB 684 GiB 8 KiB   146 GiB 81 GiB 91.08 1.00  47 up
                     TOTAL 912 GiB 830 GiB 684 GiB 8.1 KiB 146 GiB 81 GiB 91.08
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

root@host# ceph osd df name osd.696
ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL  %USE  VAR  PGS STATUS
696 nvme  0.91199 1.00000  912 GiB 830 GiB 684 GiB 8 KiB   146 GiB 81 GiB 91.07 1.00  47 up
                     TOTAL 912 GiB 830 GiB 684 GiB 8.1 KiB 146 GiB 81 GiB 91.07
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

Pool 18 is on a different device class, and its OSDs trigger the warnings as
usual; the OSDs backing pool 17 do not:

root@host# ceph health detail
HEALTH_WARN noout flag(s) set; Some pool(s) have the nodeep-scrub flag(s) set; Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
OSDMAP_FLAGS noout flag(s) set
POOL_SCRUB_FLAGS Some pool(s) have the nodeep-scrub flag(s) set
    Pool meta_ru1b has nodeep-scrub flag
    Pool data_ru1b has nodeep-scrub flag
PG_BACKFILL_FULL Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
    pg 18.1008 is active+remapped+backfill_wait+backfill_toofull, acting [336,462,580]
    pg 18.27e0 is active+remapped+backfill_wait+backfill_toofull, acting [401,627,210]

In my experience, while an OSD drains Ceph warns first at backfillfull_ratio,
then at nearfull_ratio, and keeps warning until usage drops below 85% -- and
I don't think it is even possible to configure a silence for this.
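To spell out what I expected to see -- a minimal sketch in Python, not Ceph
source, just the comparison of %USE against the osdmap ratios shown in the
dump further down:

# Minimal sketch of the warning I expected, assuming the mons simply
# compare each OSD's raw usage against the osdmap fullness ratios.
NEARFULL_RATIO     = 0.85   # from "ceph osd dump" below
BACKFILLFULL_RATIO = 0.90
FULL_RATIO         = 0.95

def expected_warning(use_percent: float) -> str:
    """Map %USE from 'ceph osd df' onto the health code I would expect."""
    ratio = use_percent / 100.0
    if ratio >= FULL_RATIO:
        return "OSD_FULL"
    if ratio >= BACKFILLFULL_RATIO:
        return "OSD_BACKFILLFULL"
    if ratio >= NEARFULL_RATIO:
        return "OSD_NEARFULL"
    return "HEALTH_OK"

print(expected_warning(91.09))  # osd.696 -> OSD_BACKFILLFULL, yet health is quiet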
Current usage:

root@host# ceph df detail
RAW STORAGE:
    CLASS     SIZE        AVAIL        USED        RAW USED     %RAW USED
    hdd       4.3 PiB     1022 TiB     3.3 PiB      3.3 PiB         76.71
    nvme      161 TiB       61 TiB      82 TiB      100 TiB         62.30
    TOTAL     4.4 PiB      1.1 PiB     3.4 PiB      3.4 PiB         76.20

POOLS:
    POOL          ID     PGS       STORED      OBJECTS     USED        %USED     MAX AVAIL     QUOTA OBJECTS     QUOTA BYTES     DIRTY     USED COMPR     UNDER COMPR
    meta_ru1b     17      2048     3.1 TiB     7.15G        82 TiB     92.77       2.1 TiB     N/A               N/A             7.15G     0 B            0 B
    data_ru1b     18     16384     1.1 PiB     3.07G       3.3 PiB     88.29       148 TiB     N/A               N/A             3.07G     0 B            0 B

Current OSD dump header:

epoch 270540
fsid ccf2c233-4adf-423c-b734-236220096d4e
created 2019-02-14 15:30:56.642918
modified 2021-04-21 20:33:54.481616
flags noout,sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 7255
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client jewel
require_osd_release nautilus
pool 17 'meta_ru1b' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 2048 pgp_num 2048 autoscale_mode warn last_change 240836 lfor 0/0/51990 flags hashpspool,nodeep-scrub stripe_width 0 application metadata
pool 18 'data_ru1b' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16384 pgp_num 16384 autoscale_mode warn last_change 270529 lfor 0/0/52038 flags hashpspool,nodeep-scrub stripe_width 0 application data
max_osd 780

Current versions:

{
    "mon": {
        "ceph version 14.2.19 (bb796b9b5bab9463106022eef406373182465d11) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.19 (bb796b9b5bab9463106022eef406373182465d11) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.19 (bb796b9b5bab9463106022eef406373182465d11) nautilus (stable)": 780
    },
    "mds": {},
    "overall": {
        "ceph version 14.2.19 (bb796b9b5bab9463106022eef406373182465d11) nautilus (stable)": 786
    }
}

Dan, does anything like this ring a bell? My guess is that some counter type
overflowed somewhere.
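For what it's worth, pool 17 holds 7.15G objects, which is already past
2^32 (~4.29G), while pool 18's 3.07G is not -- that would at least line up
with only pool 17's OSDs staying quiet. A purely hypothetical illustration
(the 32-bit field is my assumption, not something I found in the code):

# Hypothetical: if some per-pool statistic were accumulated in a 32-bit
# field anywhere along the reporting chain, a value past 2**32 would
# silently wrap. The field itself is an assumption; the arithmetic is real.
objects_pool_17 = 7_150_000_000   # 7.15G objects, from "ceph df detail" above
objects_pool_18 = 3_070_000_000   # 3.07G objects, still below 2**32

for pool, n in (("meta_ru1b (17)", objects_pool_17),
                ("data_ru1b (18)", objects_pool_18)):
    wrapped = n % 2**32
    print(f"{pool}: real={n:,} as-uint32={wrapped:,} overflowed={wrapped != n}")

Thanks,
k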