Hi,

We appear to be stuck in a proverbial chicken-and-egg situation: degraded placement groups won’t backfill because OSDs are near full, and we can’t run the balancer because some placement groups are degraded.

We upgraded Ceph from Luminous 12.2.12 to Nautilus 14.2.1 on a cluster used for backup services. We are in the process of migrating data (nearly complete), after which we’ll be able to repurpose the old systems as additional Ceph OSD nodes.
Our cluster was subsequently at about 75% utilisation, and the balancer module together with upmap did a great job. We’ve historically been very conservative with placement group counts, considering that smaller drives generally get replaced with much larger ones and the PGs per OSD subsequently grow to problematic levels.

The upgrade process was so painless that we also enabled the pg_autoscaler module, which subsequently marked 75% of the data as misplaced, but also degraded various placement groups. The result is that we now have OSDs and pools marked as nearfull and many placement groups that are too full to backfill, but we can’t run the balancer as some placement groups are in a degraded state.

Is there a way we can override the degraded check and force the balancer to redistribute PGs, or could we manually adjust OSDs to have the same effect? Alternatively, is there a way to get Ceph to first heal the degraded PGs and only then work on the misplaced ones?

There are only 3 RBD images in this cluster: an 80 GB operating system image in a replicated SSD pool, a 150 TB erasure coded image, and a relatively tiny replicated SSD caching tier for the EC pool.

[admin@kvm7e ~]# ceph osd lspools
1 rbd_ssd
5 cephfs_data
6 cephfs_metadata
7 rbd_hdd
8 ec_hdd
9 rbd_hdd_cache
10 ec_hdd_cache
[admin@kvm7e ~]# for f in `ceph osd lspools | cut -d' ' -f2`; do ceph osd pool set $f pg_autoscale_mode on; done;
set pool 1 pg_autoscale_mode to on
set pool 5 pg_autoscale_mode to on
set pool 6 pg_autoscale_mode to on
set pool 7 pg_autoscale_mode to on
set pool 8 pg_autoscale_mode to on
set pool 9 pg_autoscale_mode to on
set pool 10 pg_autoscale_mode to on

What concerned us was that Ceph marked OSDs as near full, although by default this should only happen once an OSD reaches 85% utilisation. I presume Ceph projects the resulting storage utilisation based on the weighting set by the balancer?
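To make the mismatch concrete, here is a rough sketch of how per-OSD utilisation can be compared against a threshold. The helper name `osds_over` is made up for illustration, and the field position assumes the Nautilus `ceph osd df` column layout shown further down (%USE is the fourth field from the end). The sample rows are copied from the first `ceph osd df` output in this post.

```shell
# Print "osd.<id> <%USE>" for every OSD at or above a utilisation threshold.
# Hypothetical helper: assumes %USE is the fourth field from the end
# (... %USE VAR PGS STATUS), as in the Nautilus `ceph osd df` output.
osds_over() {
    awk -v limit="$1" '
        # Skip the header and TOTAL lines: only rows with a numeric OSD id.
        NF >= 4 && $1 ~ /^[0-9]+$/ && $(NF-3) + 0 >= limit + 0 {
            print "osd." $1, $(NF-3)
        }'
}

# On a live cluster this would be:  ceph osd df | osds_over 85
# Here we feed it a few rows copied from the output in this post instead.
sample_rows='ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
100 hdd 1.81929 1.00000 1.8 TiB 1.3 TiB 1.3 TiB 9.2 MiB 2.5 GiB 490 GiB 73.71 1.01 59 up
104 hdd 9.09560 1.00000 9.1 TiB 6.9 TiB 6.9 TiB 28 MiB 13 GiB 2.2 TiB 75.68 1.03 304 up
10100 ssd 0.40630 1.00000 416 GiB 40 GiB 39 GiB 28 MiB 996 MiB 376 GiB 9.60 0.13 35 up'

printf '%s\n' "$sample_rows" | osds_over 75
```

With these rows, a threshold of 75 lists only osd.104, and a threshold of 85 lists nothing at all, which is what makes the nearfull warnings surprising.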

[admin@kvm7e ~]# ceph health detail
HEALTH_ERR noout flag(s) set; 6 nearfull osd(s); 4 pool(s) nearfull; Reduced data availability: 2 pgs inactive; Degraded data redundancy (low space): 4 pgs backfill_toofull
OSDMAP_FLAGS noout flag(s) set
OSD_NEARFULL 6 nearfull osd(s)
    osd.100 is near full
    osd.101 is near full
    osd.102 is near full
    osd.103 is near full
    osd.104 is near full
    osd.105 is near full
POOL_NEARFULL 4 pool(s) nearfull
    pool 'cephfs_data' is nearfull
    pool 'cephfs_metadata' is nearfull
    pool 'rbd_hdd' is nearfull
    pool 'ec_hdd' is nearfull
PG_AVAILABILITY Reduced data availability: 2 pgs inactive
    pg 7.1e is stuck inactive for 437.102346, current state clean+premerge+peered, last acting [303,104,405]
    pg 7.3e is stuck inactive for 436.965670, current state remapped+premerge+backfill_wait+peered, last acting [405,104,301]
PG_DEGRADED_FULL Degraded data redundancy (low space): 4 pgs backfill_toofull
    pg 8.c8 is active+remapped+backfill_wait+backfill_toofull, acting [305,104,404,504,203]
    pg 8.1bb is active+remapped+backfill_wait+backfill_toofull, acting [505,204,102,304,404]
    pg 8.326 is active+remapped+backfill_wait+backfill_toofull, acting [302,504,402,103,202]
    pg 8.3e0 is active+remapped+backfill_wait+backfill_toofull, acting [202,402,103,305,505]

[admin@kvm7e ~]# ceph osd df
   ID CLASS  WEIGHT REWEIGHT    SIZE RAW USE    DATA    OMAP     META   AVAIL  %USE  VAR PGS STATUS
  100   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 9.2 MiB  2.5 GiB 490 GiB 73.71 1.01  59 up
  101   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 4.2 MiB  2.5 GiB 489 GiB 73.77 1.01  58 up
  102   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  21 MiB  7.4 GiB 1.4 TiB 73.63 1.01 175 up
  103   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  20 MiB  7.4 GiB 1.4 TiB 73.49 1.00 176 up
  104   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  28 MiB   13 GiB 2.2 TiB 75.68 1.03 304 up
  105   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  23 MiB   13 GiB 2.2 TiB 75.63 1.03 301 up
  200   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 4.1 MiB  2.5 GiB 492 GiB 73.61 1.01  59 up
  201   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 4.2 MiB  2.5 GiB 491 GiB 73.65 1.01  58 up
  202   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  21 MiB  7.4 GiB 1.4 TiB 73.60 1.01 175 up
  203   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  18 MiB  7.4 GiB 1.4 TiB 73.51 1.00 175 up
  204   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  40 MiB   13 GiB 2.2 TiB 75.65 1.03 301 up
  205   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  41 MiB   13 GiB 2.2 TiB 75.71 1.03 302 up
  300   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 6.1 MiB  2.5 GiB 490 GiB 73.68 1.01  58 up
  301   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB  13 MiB  2.5 GiB 490 GiB 73.72 1.01  59 up
  302   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  13 MiB  7.4 GiB 1.4 TiB 73.57 1.01 174 up
  303   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  22 MiB  7.4 GiB 1.4 TiB 73.56 1.01 177 up
  304   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  42 MiB   13 GiB 2.2 TiB 75.68 1.03 302 up
  305   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  40 MiB   13 GiB 2.2 TiB 75.67 1.03 304 up
  400   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 8.6 MiB  2.5 GiB 489 GiB 73.75 1.01  59 up
  401   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 8.7 MiB  2.5 GiB 493 GiB 73.56 1.01  59 up
  402   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  25 MiB  7.4 GiB 1.4 TiB 73.51 1.00 176 up
  403   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  18 MiB  7.4 GiB 1.4 TiB 73.59 1.01 175 up
  404   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  40 MiB   13 GiB 2.2 TiB 75.68 1.03 298 up
  405   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  40 MiB   13 GiB 2.2 TiB 75.69 1.03 301 up
  500   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB 8.8 MiB  2.5 GiB 491 GiB 73.66 1.01  59 up
  501   hdd 1.81929  1.00000 1.8 TiB 1.3 TiB 1.3 TiB  10 MiB  2.5 GiB 491 GiB 73.63 1.01  58 up
  502   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  22 MiB  7.4 GiB 1.4 TiB 73.62 1.01 174 up
  503   hdd 5.45789  1.00000 5.5 TiB 4.0 TiB 4.0 TiB  21 MiB  7.4 GiB 1.4 TiB 73.54 1.01 177 up
  504   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  38 MiB   13 GiB 2.2 TiB 75.67 1.03 301 up
  505   hdd 9.09560  1.00000 9.1 TiB 6.9 TiB 6.9 TiB  42 MiB   13 GiB 2.2 TiB 75.67 1.03 300 up
10100   ssd 0.40630  1.00000 416 GiB  40 GiB  39 GiB  28 MiB  996 MiB 376 GiB  9.60 0.13  35 up
10101   ssd 0.40630  1.00000 416 GiB  44 GiB  43 GiB  20 MiB 1004 MiB 372 GiB 10.57 0.14  37 up
10200   ssd 0.40630  1.00000 416 GiB  32 GiB  31 GiB  15 MiB 1009 MiB 384 GiB  7.80 0.11  36 up
10201   ssd 0.40630  1.00000 416 GiB  32 GiB  31 GiB  16 MiB 1008 MiB 384 GiB  7.67 0.10  37 up
10300   ssd 0.40630  1.00000 416 GiB  40 GiB  39 GiB  17 MiB 1007 MiB 376 GiB  9.71 0.13  36 up
10301   ssd 0.40630  1.00000 416 GiB  43 GiB  42 GiB  18 MiB 1006 MiB 373 GiB 10.31 0.14  37 up
10400   ssd 0.40630  1.00000 416 GiB  34 GiB  33 GiB  17 MiB 1007 MiB 382 GiB  8.14 0.11  35 up
10401   ssd 0.40630  1.00000 416 GiB  40 GiB  39 GiB  19 MiB 1005 MiB 376 GiB  9.64 0.13  36 up
10500   ssd 0.40630  1.00000 416 GiB  38 GiB  37 GiB  18 MiB 1006 MiB 378 GiB  9.11 0.12  37 up
10501   ssd 0.40630  1.00000 416 GiB  46 GiB  45 GiB  18 MiB 1006 MiB 370 GiB 10.98 0.15  37 up
                       TOTAL 168 TiB 123 TiB 123 TiB 839 MiB  234 GiB  45 TiB 73.16

[admin@kvm7e ~]# for f in /var/run/ceph/ceph-osd.*.asok; do ceph --admin-daemon $f config show; done | grep 'full'
    "mon_cache_target_full_warn_ratio": "0.660000",
    "mon_osd_backfillfull_ratio": "0.900000",
    "mon_osd_full_ratio": "0.950000",
    "mon_osd_nearfull_ratio": "0.850000",
    "mon_osdmap_full_prune_enabled": "true",
    "mon_osdmap_full_prune_interval": "10",
    "mon_osdmap_full_prune_min": "10000",
    "mon_osdmap_full_prune_txsize": "100",
    "osd_debug_skip_full_check_in_backfill_reservation": "false",
    "osd_debug_skip_full_check_in_recovery": "false",
    "osd_failsafe_full_ratio": "0.970000",
    "osd_pool_default_cache_target_full_ratio": "0.800000",
    "paxos_stash_full_interval": "25",
<snip>

An hour later:

[admin@kvm7e ~]# ceph health detail
HEALTH_ERR noout flag(s) set; 6 nearfull osd(s); 4 pool(s) nearfull; Degraded data redundancy: 2891706/93291241 objects degraded (3.100%), 165 pgs degraded, 165 pgs undersized;
Degraded data redundancy (low space): 647 pgs backfill_toofull
OSDMAP_FLAGS noout flag(s) set
OSD_NEARFULL 6 nearfull osd(s)
    osd.100 is near full
    osd.101 is near full
    osd.102 is near full
    osd.103 is near full
    osd.104 is near full
    osd.105 is near full
POOL_NEARFULL 4 pool(s) nearfull
    pool 'cephfs_data' is nearfull
    pool 'cephfs_metadata' is nearfull
    pool 'rbd_hdd' is nearfull
    pool 'ec_hdd' is nearfull
PG_DEGRADED Degraded data redundancy: 2891706/93291241 objects degraded (3.100%), 165 pgs degraded, 165 pgs undersized
    pg 8.304 is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,503,304,402,203]
    pg 8.307 is stuck undersized for 7968.693589, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,205,503,2147483647,401]
    pg 8.30c is stuck undersized for 7968.739427, current state active+undersized+degraded+remapped+backfill_toofull, last acting [501,2147483647,202,304,402]
    pg 8.311 is stuck undersized for 7968.732720, current state active+undersized+degraded+remapped+backfill_toofull, last acting [405,2147483647,502,204,305]
    pg 8.314 is stuck undersized for 7968.732950, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,302,503,205,405]
    pg 8.319 is stuck undersized for 7968.716745, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,403,204,505,305]
    pg 8.31c is stuck undersized for 7968.713635, current state active+undersized+degraded+remapped+backfill_toofull, last acting [500,2147483647,402,305,204]
    pg 8.327 is stuck undersized for 7968.664546, current state active+undersized+degraded+remapped+backfill_toofull, last acting [202,305,504,2147483647,405]
    pg 8.32f is stuck undersized for 7968.682409, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,402,204,504,302]
    pg 8.332 is stuck undersized for 7968.732504, current state active+undersized+degraded+remapped+backfill_toofull, last acting [302,405,2147483647,205,502]
    pg 8.334 is stuck undersized for 7968.694182, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,205,404,302,503]
    pg 8.33c is stuck undersized for 7968.734577, current state active+undersized+degraded+remapped+backfill_toofull, last acting [302,505,405,2147483647,201]
    pg 8.33f is stuck undersized for 7968.552298, current state active+undersized+degraded+remapped+backfill_toofull, last acting [400,504,2147483647,304,204]
    pg 8.348 is stuck undersized for 7968.696137, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,2147483647,404,504,203]
    pg 8.34c is stuck undersized for 7968.768111, current state active+undersized+degraded+remapped+backfilling, last acting [504,204,2147483647,303,403]
    pg 8.350 is stuck undersized for 7968.734046, current state active+undersized+degraded+remapped+backfill_toofull, last acting [405,502,305,2147483647,205]
    pg 8.35a is stuck undersized for 7968.685123, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,201,404,504,304]
    pg 8.35c is stuck undersized for 7968.700947, current state active+undersized+degraded+remapped+backfill_toofull, last acting [404,2147483647,303,505,205]
    pg 8.35e is stuck undersized for 7968.683728, current state active+undersized+degraded+remapped+backfill_wait, last acting [402,2147483647,304,500,203]
    pg 8.361 is stuck undersized for 7968.798644, current state active+undersized+degraded+remapped+backfilling, last acting [505,404,304,2147483647,204]
    pg 8.365 is stuck undersized for 7968.731458, current state active+undersized+degraded+remapped+backfill_toofull, last acting [405,303,2147483647,205,504]
    pg 8.368 is stuck undersized for 7968.799312, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,304,404,502,205]
    pg 8.36b is stuck undersized for 7968.736514, current state active+undersized+degraded+remapped+backfill_wait, last acting [300,403,204,2147483647,505]
    pg 8.36f is stuck undersized for 7968.695546, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,503,2147483647,405,205]
    pg 8.373 is stuck undersized for 7968.717140, current state active+undersized+degraded+remapped+backfill_toofull, last acting [403,303,2147483647,202,502]
    pg 8.379 is stuck undersized for 7968.732125, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,405,205,302,504]
    pg 8.37c is stuck undersized for 7968.712063, current state active+undersized+degraded+remapped+backfill_toofull, last acting [401,500,205,2147483647,302]
    pg 8.37d is stuck undersized for 7968.740233, current state active+undersized+degraded+remapped+backfill_toofull, last acting [501,302,2147483647,205,404]
    pg 8.384 is stuck undersized for 7968.796821, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,503,304,402,203]
    pg 8.387 is stuck undersized for 7968.639604, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,205,503,2147483647,401]
    pg 8.392 is stuck undersized for 7968.771812, current state active+undersized+degraded+remapped+backfill_toofull, last acting [504,305,405,202,2147483647]
    pg 8.394 is stuck undersized for 7968.734314, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,302,503,205,405]
    pg 8.399 is stuck undersized for 7968.722090, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,403,204,505,305]
    pg 8.39c is stuck undersized for 7968.712907, current state active+undersized+degraded+remapped+backfill_toofull, last acting [500,2147483647,402,305,204]
    pg 8.39d is stuck undersized for 7968.725227, current state active+undersized+degraded+remapped+backfill_toofull, last acting [403,2147483647,502,305,203]
    pg 8.3a5 is stuck undersized for 7968.644387, current state active+undersized+degraded+remapped+backfill_toofull, last acting [204,2147483647,304,400,505]
    pg 8.3a7 is stuck undersized for 7968.668449, current state active+undersized+degraded+remapped+backfill_toofull, last acting [202,305,504,2147483647,405]
    pg 8.3af is stuck undersized for 7968.683191, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,402,204,504,302]
    pg 8.3b2 is stuck undersized for 7968.733004, current state active+undersized+degraded+remapped+backfill_toofull, last acting [302,405,2147483647,205,502]
    pg 8.3b4 is stuck undersized for 7968.694415, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,205,404,302,503]
    pg 8.3bc is stuck undersized for 7968.732598, current state active+undersized+degraded+remapped+backfill_toofull, last acting [302,505,405,2147483647,201]
    pg 8.3cc is stuck undersized for 7968.771592, current state active+undersized+degraded+remapped+backfill_toofull, last acting [504,204,2147483647,303,403]
    pg 8.3d0 is stuck undersized for 7968.733520, current state active+undersized+degraded+remapped+backfill_toofull, last acting [405,502,305,2147483647,205]
    pg 8.3de is stuck undersized for 7968.681841, current state active+undersized+degraded+remapped+backfill_toofull, last acting [402,2147483647,304,500,203]
    pg 8.3df is stuck undersized for 7968.726621, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,200,303,500,404]
    pg 8.3e1 is stuck undersized for 7968.796332, current state active+undersized+degraded+remapped+backfill_toofull, last acting [505,404,304,2147483647,204]
    pg 8.3ec is stuck undersized for 7968.776206, current state active+undersized+degraded+remapped+backfill_toofull, last acting [502,302,405,2147483647,204]
    pg 8.3ef is stuck undersized for 7968.690806, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,503,2147483647,405,205]
    pg 8.3f7 is stuck undersized for 7968.672395, current state active+undersized+degraded+remapped+backfill_toofull, last acting [205,505,404,305,2147483647]
    pg 8.3fc is stuck undersized for 7968.711543, current state active+undersized+degraded+remapped+backfill_toofull, last acting [401,500,205,2147483647,302]
    pg 8.3fd is stuck undersized for 7968.740233, current state active+undersized+degraded+remapped+backfill_toofull, last acting [501,302,2147483647,205,404]
PG_DEGRADED_FULL Degraded data redundancy (low space): 647 pgs backfill_toofull
    pg 8.3bc is active+undersized+degraded+remapped+backfill_toofull, acting [302,505,405,2147483647,201]
    pg 8.3bd is active+remapped+backfill_wait+backfill_toofull, acting [202,402,105,503,304]
    pg 8.3be is active+remapped+backfill_toofull, acting [504,303,205,100,403]
    pg 8.3c0 is active+remapped+backfill_toofull, acting [402,304,503,105,205]
    pg 8.3c1 is active+remapped+backfill_toofull, acting [301,101,405,504,204]
    pg 8.3c2 is active+remapped+backfill_toofull, acting [104,204,305,503,405]
    pg 8.3c3 is active+remapped+backfill_toofull, acting [405,105,204,303,505]
    pg 8.3c4 is active+remapped+backfill_toofull, acting [504,305,101,403,201]
    pg 8.3c6 is active+remapped+backfill_toofull, acting [404,304,205,504,105]
    pg 8.3c8 is active+remapped+backfill_toofull, acting [305,104,404,504,203]
    pg 8.3c9 is active+remapped+backfill_toofull, acting [404,505,102,301,203]
    pg 8.3cb is active+remapped+backfill_wait+backfill_toofull, acting [105,200,402,505,304]
    pg 8.3cc is active+undersized+degraded+remapped+backfill_toofull, acting [504,204,2147483647,303,403]
    pg 8.3cd is active+remapped+backfill_wait+backfill_toofull, acting [105,305,403,504,205]
    pg 8.3cf is active+remapped+backfill_toofull, acting [502,400,304,105,202]
    pg 8.3d0 is active+undersized+degraded+remapped+backfill_toofull, acting [405,502,305,2147483647,205]
    pg 8.3d1 is active+remapped+backfill_toofull, acting [205,103,404,502,304]
    pg 8.3d2 is active+remapped+backfill_toofull, acting [202,505,304,403,103]
    pg 8.3d3 is active+remapped+backfill_toofull, acting [204,101,405,505,302]
    pg 8.3d4 is active+remapped+backfill_toofull, acting [503,305,404,100,205]
    pg 8.3d5 is active+remapped+backfill_wait+backfill_toofull, acting [105,503,203,401,304]
    pg 8.3d7 is active+remapped+backfill_toofull, acting [504,305,404,200,102]
    pg 8.3d9 is active+remapped+backfill_toofull, acting [202,302,402,105,504]
    pg 8.3da is active+remapped+backfill_toofull, acting [104,201,404,504,304]
    pg 8.3dc is active+remapped+backfill_wait+backfill_toofull, acting [404,104,303,505,205]
    pg 8.3dd is active+remapped+backfill_toofull, acting [404,204,505,302,105]
    pg 8.3de is active+undersized+degraded+remapped+backfill_toofull, acting [402,2147483647,304,500,203]
    pg 8.3df is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,200,303,500,404]
    pg 8.3e0 is active+remapped+backfill_toofull, acting [202,402,103,305,505]
    pg 8.3e1 is active+undersized+degraded+remapped+backfill_toofull, acting [505,404,304,2147483647,204]
    pg 8.3e2 is active+remapped+backfill_toofull, acting [304,102,505,401,203]
    pg 8.3e4 is active+remapped+backfill_toofull, acting [303,102,503,202,405]
    pg 8.3e5 is active+remapped+backfill_toofull, acting [405,303,104,205,504]
    pg 8.3e6 is active+remapped+backfill_wait+backfill_toofull, acting [504,105,204,404,305]
    pg 8.3e7 is active+remapped+backfill_toofull, acting [103,202,505,405,304]
    pg 8.3eb is active+remapped+backfill_toofull, acting [300,403,204,104,505]
    pg 8.3ec is active+undersized+degraded+remapped+backfill_toofull, acting [502,302,405,2147483647,204]
    pg 8.3ee is active+remapped+backfill_toofull, acting [403,300,204,503,100]
    pg 8.3ef is active+undersized+degraded+remapped+backfill_toofull, acting [305,503,2147483647,405,205]
    pg 8.3f0 is active+remapped+backfill_toofull, acting [102,203,500,304,403]
    pg 8.3f1 is active+remapped+backfill_toofull, acting [505,305,404,105,202]
    pg 8.3f2 is active+remapped+backfill_wait+backfill_toofull, acting [105,304,403,202,502]
    pg 8.3f4 is active+remapped+backfill_toofull, acting [205,505,102,405,303]
    pg 8.3f5 is active+remapped+backfill_toofull, acting [405,105,304,504,201]
    pg 8.3f6 is active+remapped+backfill_wait+backfill_toofull, acting [105,204,505,304,404]
    pg 8.3f7 is active+undersized+degraded+remapped+backfill_toofull, acting [205,505,404,305,2147483647]
    pg 8.3fb is active+remapped+backfill_toofull, acting [303,505,401,105,203]
    pg 8.3fc is active+undersized+degraded+remapped+backfill_toofull, acting [401,500,205,2147483647,302]
    pg 8.3fd is active+undersized+degraded+remapped+backfill_toofull, acting [501,302,2147483647,205,404]
    pg 8.3fe is active+remapped+backfill_toofull, acting [504,304,402,205,105]
    pg 8.3ff is active+remapped+backfill_toofull, acting [405,202,102,501,303]

About a day after enabling the pg_autoscaler module:

[admin@kvm7e ~]# ceph health detail
HEALTH_ERR noout flag(s) set; 18 nearfull osd(s); 4 pool(s) nearfull; Degraded data redundancy: 4227162/93352306 objects degraded (4.528%), 250 pgs degraded, 253 pgs undersized;
Degraded data redundancy (low space): 559 pgs backfill_toofull
OSDMAP_FLAGS noout flag(s) set
OSD_NEARFULL 18 nearfull osd(s)
    osd.100 is near full
    osd.101 is near full
    osd.102 is near full
    osd.103 is near full
    osd.104 is near full
    osd.105 is near full
    osd.200 is near full
    osd.201 is near full
    osd.203 is near full
    osd.300 is near full
    osd.301 is near full
    osd.303 is near full
    osd.304 is near full
    osd.401 is near full
    osd.404 is near full
    osd.501 is near full
    osd.502 is near full
    osd.504 is near full
POOL_NEARFULL 4 pool(s) nearfull
    pool 'cephfs_data' is nearfull
    pool 'cephfs_metadata' is nearfull
    pool 'rbd_hdd' is nearfull
    pool 'ec_hdd' is nearfull
PG_DEGRADED Degraded data redundancy: 4227162/93352306 objects degraded (4.528%), 250 pgs degraded, 253 pgs undersized
    pg 8.35a is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,201,404,504,304]
    pg 8.35b is stuck undersized for 13233.292244, current state active+undersized+degraded+remapped+backfill_toofull, last acting [203,503,2147483647,305,105]
    pg 8.361 is stuck undersized for 67655.237839, current state active+undersized+remapped+backfill_toofull, last acting [505,404,304,2147483647,204]
    pg 8.362 is stuck undersized for 67682.460453, current state active+undersized+degraded+remapped+backfill_toofull, last acting [304,2147483647,505,401,203]
    pg 8.363 is stuck undersized for 67665.785277, current state active+undersized+degraded+remapped+backfill_toofull, last acting [404,203,301,504,2147483647]
    pg 8.365 is stuck undersized for 13232.214476, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,303,104,205,504]
    pg 8.366 is stuck undersized for 67665.803857, current state active+undersized+degraded+remapped+backfill_toofull, last acting [504,2147483647,204,404,305]
    pg 8.368 is stuck undersized for 67655.243869, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,304,404,502,205]
    pg 8.36d is stuck undersized for 67682.440479, current state active+undersized+degraded+remapped+backfill_toofull, last acting [404,2147483647,201,502,304]
    pg 8.36f is stuck undersized for 13232.182277, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,503,104,2147483647,205]
    pg 8.373 is stuck undersized for 67665.636152, current state active+undersized+degraded+remapped+backfill_toofull, last acting [403,303,2147483647,202,502]
    pg 8.374 is stuck undersized for 13232.196886, current state active+undersized+degraded+remapped+backfill_toofull, last acting [205,505,102,2147483647,303]
    pg 8.375 is stuck undersized for 13232.147722, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,105,304,504,201]
    pg 8.379 is stuck undersized for 13232.052714, current state active+undersized+degraded+remapped+backfill_toofull, last acting [104,2147483647,205,302,504]
    pg 8.37a is stuck undersized for 13233.378060, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,105,502,202,2147483647]
    pg 8.37d is stuck undersized for 67665.810745, current state active+undersized+degraded+remapped+backfill_toofull, last acting [501,302,2147483647,205,404]
    pg 8.381 is stuck undersized for 13233.351431, current state active+undersized+degraded+remapped+backfill_toofull, last acting [304,204,2147483647,101,503]
    pg 8.382 is stuck undersized for 13232.226588, current state active+undersized+degraded+remapped+backfill_toofull, last acting [504,205,2147483647,300,105]
    pg 8.387 is stuck undersized for 67665.791360, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,205,503,2147483647,401]
    pg 8.391 is stuck undersized for 13233.342426, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,104,502,204,305]
    pg 8.392 is stuck undersized for 13232.227756, current state active+undersized+degraded+remapped+backfill_toofull, last acting [504,305,2147483647,202,104]
    pg 8.396 is stuck undersized for 13233.333363, current state active+undersized+degraded+remapped+backfill_toofull, last acting [303,102,2147483647,504,204]
    pg 8.399 is stuck undersized for 67665.635750, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,403,204,505,305]
    pg 8.39c is stuck undersized for 67655.266511, current state active+undersized+degraded+remapped+backfill_toofull, last acting [500,2147483647,402,305,204]
    pg 8.39d is stuck undersized for 67655.215300, current state active+undersized+degraded+remapped+backfill_toofull, last acting [403,2147483647,502,305,203]
    pg 8.3a0 is stuck undersized for 13233.339729, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,104,304,501,204]
    pg 8.3a1 is stuck undersized for 13232.162171, current state active+undersized+degraded+remapped+backfill_toofull, last acting [105,505,205,303,2147483647]
    pg 8.3a4 is stuck undersized for 13233.379809, current state active+undersized+degraded+remapped+backfilling, last acting [2147483647,305,102,502,203]
    pg 8.3ad is stuck undersized for 13233.368722, current state active+undersized+degraded+remapped+backfill_toofull, last acting [301,202,501,2147483647,103]
    pg 8.3ae is stuck undersized for 67665.744856, current state active+undersized+degraded+remapped+backfill_toofull, last acting [304,503,404,200,2147483647]
    pg 8.3af is stuck undersized for 67665.769691, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,402,204,504,302]
    pg 8.3b4 is stuck undersized for 67682.382958, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,205,404,302,503]
    pg 8.3bc is stuck undersized for 13232.171438, current state active+undersized+degraded+remapped+backfilling, last acting [302,505,2147483647,104,201]
    pg 8.3c3 is stuck undersized for 13232.151774, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,105,204,303,505]
    pg 8.3c6 is stuck undersized for 67665.792220, current state active+undersized+degraded+remapped+backfill_toofull, last acting [404,304,205,504,2147483647]
    pg 8.3ca is stuck undersized for 67665.817464, current state active+undersized+degraded+remapped+backfill_toofull, last acting [503,2147483647,205,305,400]
    pg 8.3cc is stuck undersized for 67665.802241, current state active+undersized+degraded+remapped+backfill_toofull, last acting [504,204,2147483647,303,403]
    pg 8.3d3 is stuck undersized for 13233.266745, current state active+undersized+degraded+remapped+backfill_toofull, last acting [204,101,2147483647,505,302]
    pg 8.3de is stuck undersized for 67655.238851, current state active+undersized+degraded+remapped+backfill_toofull, last acting [402,2147483647,304,500,203]
    pg 8.3df is stuck undersized for 67682.379100, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,200,303,500,404]
    pg 8.3e1 is stuck undersized for 67655.229239, current state active+undersized+degraded+remapped+backfill_toofull, last acting [505,404,304,2147483647,204]
    pg 8.3e3 is stuck undersized for 67665.788025, current state active+undersized+degraded+remapped+backfill_toofull, last acting [404,203,301,504,2147483647]
    pg 8.3e5 is stuck undersized for 13233.333897, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,303,104,205,504]
    pg 8.3e7 is stuck undersized for 13233.318293, current state active+undersized+degraded+remapped+backfill_toofull, last acting [103,202,505,2147483647,304]
    pg 8.3ea is stuck undersized for 67682.450664, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,2147483647,203,403,504]
    pg 8.3ec is stuck undersized for 13232.182236, current state active+undersized+degraded+remapped+backfill_toofull, last acting [502,302,2147483647,104,204]
    pg 8.3f1 is stuck undersized for 67665.809335, current state active+undersized+degraded+remapped+backfill_toofull, last acting [505,305,404,2147483647,202]
    pg 8.3fa is stuck undersized for 13232.182880, current state active+undersized+degraded+remapped+backfill_toofull, last acting [305,105,502,202,2147483647]
    pg 8.3fc is stuck undersized for 67655.245817, current state active+undersized+degraded+remapped+backfill_toofull, last acting [401,500,205,2147483647,302]
    pg 8.3fe is stuck undersized for 67665.800943, current state active+undersized+degraded+remapped+backfill_toofull, last acting [504,304,402,205,2147483647]
    pg 8.3ff is stuck undersized for 13233.326383, current state active+undersized+degraded+remapped+backfill_toofull, last acting [2147483647,202,102,501,303]
PG_DEGRADED_FULL Degraded data redundancy (low space): 559 pgs backfill_toofull
    pg 8.3af is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,402,204,504,302]
    pg 8.3b0 is active+remapped+backfill_toofull, acting [405,305,502,200,103]
    pg 8.3b1 is active+remapped+backfill_wait+backfill_toofull, acting [505,205,300,101,402]
    pg 8.3b4 is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,205,404,302,503]
    pg 8.3b5 is active+remapped+backfill_toofull, acting [204,505,100,403,304]
    pg 8.3b6 is active+remapped+backfill_toofull, acting [404,300,504,100,203]
    pg 8.3b7 is active+remapped+backfill_toofull, acting [204,105,504,303,401]
    pg 8.3b8 is active+remapped+backfill_toofull, acting [501,103,205,302,404]
    pg 8.3b9 is active+remapped+backfill_toofull, acting [502,301,402,103,204]
    pg 8.3ba is active+remapped+backfill_toofull, acting [303,102,203,403,505]
    pg 8.3bb is active+remapped+backfill_toofull, acting [505,204,102,304,404]
    pg 8.3bd is active+remapped+backfill_toofull, acting [202,402,105,503,304]
    pg 8.3be is active+remapped+backfill_toofull, acting [504,303,205,100,403]
    pg 8.3c3 is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,105,204,303,505]
    pg 8.3c4 is active+remapped+backfill_wait+backfill_toofull, acting [504,305,101,403,201]
    pg 8.3c6 is active+undersized+degraded+remapped+backfill_toofull, acting [404,304,205,504,2147483647]
    pg 8.3ca is active+undersized+degraded+remapped+backfill_toofull, acting [503,2147483647,205,305,400]
    pg 8.3cb is active+remapped+backfill_toofull, acting [105,200,402,505,304]
    pg 8.3cc is active+undersized+degraded+remapped+backfill_toofull, acting [504,204,2147483647,303,403]
    pg 8.3cd is active+remapped+backfill_toofull, acting [105,305,403,504,205]
    pg 8.3cf is active+remapped+backfill_toofull, acting [502,400,304,105,202]
    pg 8.3d0 is active+remapped+backfill_toofull, acting [405,502,305,104,205]
    pg 8.3d1 is active+remapped+backfill_toofull, acting [205,103,404,502,304]
    pg 8.3d3 is active+undersized+degraded+remapped+backfill_toofull, acting [204,101,2147483647,505,302]
    pg 8.3d4 is active+remapped+backfill_toofull, acting [503,305,404,100,205]
    pg 8.3d7 is active+remapped+backfill_toofull, acting [504,305,404,200,102]
    pg 8.3d8 is active+remapped+backfill_toofull, acting [403,202,102,303,500]
    pg 8.3da is active+remapped+backfill_wait+backfill_toofull, acting [104,201,404,504,304]
    pg 8.3dc is active+remapped+backfill_wait+backfill_toofull, acting [404,104,303,505,205]
    pg 8.3de is active+undersized+degraded+remapped+backfill_toofull, acting [402,2147483647,304,500,203]
    pg 8.3df is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,200,303,500,404]
    pg 8.3e0 is active+remapped+backfill_wait+backfill_toofull, acting [202,402,103,305,505]
    pg 8.3e1 is active+undersized+degraded+remapped+backfill_toofull, acting [505,404,304,2147483647,204]
    pg 8.3e2 is active+remapped+backfill_toofull, acting [304,102,505,401,203]
    pg 8.3e3 is active+undersized+degraded+remapped+backfill_toofull, acting [404,203,301,504,2147483647]
    pg 8.3e5 is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,303,104,205,504]
    pg 8.3e6 is active+remapped+backfill_toofull, acting [504,105,204,404,305]
    pg 8.3e7 is active+undersized+degraded+remapped+backfill_toofull, acting [103,202,505,2147483647,304]
    pg 8.3e9 is active+remapped+backfill_toofull, acting [504,103,305,403,203]
    pg 8.3ea is active+undersized+degraded+remapped+backfill_toofull, acting [305,2147483647,203,403,504]
    pg 8.3ec is active+undersized+degraded+remapped+backfill_toofull, acting [502,302,2147483647,104,204]
    pg 8.3ed is active+remapped+backfill_wait+backfill_toofull, acting [404,101,201,502,304]
    pg 8.3ee is active+remapped+backfill_toofull, acting [403,300,204,503,100]
    pg 8.3f0 is active+remapped+backfill_toofull, acting [102,203,500,304,403]
    pg 8.3f1 is active+undersized+degraded+remapped+backfill_toofull, acting [505,305,404,2147483647,202]
    pg 8.3f2 is active+remapped+backfill_toofull, acting [105,304,403,202,502]
    pg 8.3f4 is active+remapped+backfill_toofull, acting [205,505,102,405,303]
    pg 8.3fa is active+undersized+degraded+remapped+backfill_toofull, acting [305,105,502,202,2147483647]
    pg 8.3fc is active+undersized+degraded+remapped+backfill_toofull, acting [401,500,205,2147483647,302]
    pg 8.3fe is active+undersized+degraded+remapped+backfill_toofull, acting [504,304,402,205,2147483647]
    pg 8.3ff is active+undersized+degraded+remapped+backfill_toofull, acting [2147483647,202,102,501,303]

Ceph OSD utilisation breakdown:

[admin@kvm7e ~]# ceph osd df
   ID CLASS  WEIGHT REWEIGHT    SIZE RAW USE    DATA    OMAP     META   AVAIL  %USE  VAR PGS STATUS
  100   hdd 1.81929  1.00000 1.8 TiB 1.6 TiB 1.6 TiB 2.2 MiB  3.5 GiB 232 GiB 87.55 1.13  68 up
  101   hdd 1.81929  1.00000 1.8 TiB 1.6 TiB 1.6 TiB 2.8 MiB  3.4 GiB 255 GiB 86.29 1.12  64 up
  102   hdd 5.45789  1.00000 5.5 TiB 4.8 TiB 4.8 TiB 6.6 MiB  9.4 GiB 696 GiB 87.54 1.13 187 up
  103   hdd 5.45789  1.00000 5.5 TiB 4.8 TiB 4.8 TiB 6.4 MiB  9.4 GiB 654 GiB 88.29 1.14 191 up
  104   hdd 9.09560  1.00000 9.1 TiB 5.9 TiB 5.9 TiB  11 MiB   12 GiB 3.2 TiB 64.67 0.84 192 up
  105   hdd 9.09560  1.00000 9.1 TiB 7.1 TiB 7.1 TiB  10 MiB   14 GiB 1.9 TiB 78.58 1.02 256 up
  200   hdd 1.81929  1.00000 
1.8 TiB 1.5 TiB 1.5 TiB 2.0 MiB 3.2 GiB 317 GiB 82.99 1.08 64 up 201 hdd 1.81929 1.00000 1.8 TiB 1.5 TiB 1.5 TiB 2.2 MiB 3.4 GiB 292 GiB 84.34 1.09 63 up 202 hdd 5.45789 1.00000 5.5 TiB 4.3 TiB 4.3 TiB 6.7 MiB 8.5 GiB 1.1 TiB 79.18 1.03 176 up 203 hdd 5.45789 1.00000 5.5 TiB 4.7 TiB 4.7 TiB 6.2 MiB 9.2 GiB 762 GiB 86.36 1.12 195 up 204 hdd 9.09560 1.00000 9.1 TiB 7.0 TiB 7.0 TiB 12 MiB 14 GiB 2.1 TiB 77.12 1.00 290 up 205 hdd 9.09560 1.00000 9.1 TiB 6.7 TiB 6.7 TiB 11 MiB 13 GiB 2.4 TiB 73.89 0.96 280 up 300 hdd 1.81929 1.00000 1.8 TiB 1.5 TiB 1.5 TiB 2.0 MiB 3.3 GiB 283 GiB 84.80 1.10 65 up 301 hdd 1.81929 1.00000 1.8 TiB 1.5 TiB 1.5 TiB 2.8 MiB 3.4 GiB 298 GiB 84.00 1.09 64 up 302 hdd 5.45789 1.00000 5.5 TiB 3.8 TiB 3.8 TiB 6.5 MiB 7.6 GiB 1.7 TiB 69.08 0.90 154 up 303 hdd 5.45789 1.00000 5.5 TiB 4.4 TiB 4.4 TiB 6.9 MiB 8.7 GiB 1.0 TiB 80.97 1.05 182 up 304 hdd 9.09560 1.00000 9.1 TiB 7.5 TiB 7.5 TiB 12 MiB 14 GiB 1.6 TiB 82.39 1.07 311 up 305 hdd 9.09560 1.00000 9.1 TiB 7.1 TiB 7.0 TiB 11 MiB 14 GiB 2.0 TiB 77.54 1.01 295 up 400 hdd 1.81929 1.00000 1.8 TiB 1.2 TiB 1.2 TiB 2.3 MiB 2.9 GiB 596 GiB 68.02 0.88 53 up 401 hdd 1.81929 1.00000 1.8 TiB 1.5 TiB 1.5 TiB 3.1 MiB 3.4 GiB 292 GiB 84.33 1.09 63 up 402 hdd 5.45789 1.00000 5.5 TiB 4.2 TiB 4.2 TiB 7.6 MiB 8.3 GiB 1.2 TiB 77.33 1.00 171 up 403 hdd 5.45789 1.00000 5.5 TiB 4.1 TiB 4.1 TiB 6.3 MiB 8.1 GiB 1.4 TiB 74.97 0.97 175 up 404 hdd 9.09560 1.00000 9.1 TiB 7.9 TiB 7.8 TiB 11 MiB 15 GiB 1.2 TiB 86.38 1.12 321 up 405 hdd 9.09560 1.00000 9.1 TiB 6.9 TiB 6.9 TiB 11 MiB 14 GiB 2.2 TiB 76.26 0.99 144 up 500 hdd 1.81929 1.00000 1.8 TiB 1.3 TiB 1.3 TiB 2.3 MiB 3.0 GiB 516 GiB 72.32 0.94 53 up 501 hdd 1.81929 1.00000 1.8 TiB 1.5 TiB 1.5 TiB 2.7 MiB 3.4 GiB 308 GiB 83.46 1.08 63 up 502 hdd 5.45789 1.00000 5.5 TiB 4.7 TiB 4.6 TiB 5.3 MiB 9.2 GiB 820 GiB 85.32 1.11 192 up 503 hdd 5.45789 1.00000 5.5 TiB 3.9 TiB 3.9 TiB 6.4 MiB 7.8 GiB 1.5 TiB 71.61 0.93 165 up 504 hdd 9.09560 1.00000 9.1 TiB 7.5 TiB 7.5 TiB 12 MiB 14 GiB 
1.6 TiB 82.31 1.07 311 up 505 hdd 9.09560 1.00000 9.1 TiB 6.9 TiB 6.9 TiB 11 MiB 14 GiB 2.2 TiB 75.78 0.98 284 up 10100 ssd 0.40630 1.00000 416 GiB 40 GiB 39 GiB 19 MiB 1005 MiB 376 GiB 9.70 0.13 35 up 10101 ssd 0.40630 1.00000 416 GiB 44 GiB 43 GiB 39 MiB 985 MiB 372 GiB 10.62 0.14 37 up 10200 ssd 0.40630 1.00000 416 GiB 33 GiB 32 GiB 17 MiB 1007 MiB 383 GiB 7.92 0.10 36 up 10201 ssd 0.40630 1.00000 416 GiB 32 GiB 31 GiB 33 MiB 991 MiB 384 GiB 7.67 0.10 37 up 10300 ssd 0.40630 1.00000 416 GiB 41 GiB 40 GiB 30 MiB 994 MiB 375 GiB 9.80 0.13 36 up 10301 ssd 0.40630 1.00000 416 GiB 43 GiB 42 GiB 37 MiB 987 MiB 373 GiB 10.37 0.13 37 up 10400 ssd 0.40630 1.00000 416 GiB 34 GiB 33 GiB 24 MiB 1000 MiB 382 GiB 8.27 0.11 35 up 10401 ssd 0.40630 1.00000 416 GiB 40 GiB 39 GiB 30 MiB 994 MiB 376 GiB 9.69 0.13 36 up 10500 ssd 0.40630 1.00000 416 GiB 38 GiB 37 GiB 30 MiB 994 MiB 378 GiB 9.15 0.12 37 up 10501 ssd 0.40630 1.00000 416 GiB 46 GiB 45 GiB 31 MiB 993 MiB 370 GiB 11.08 0.14 37 up TOTAL 168 TiB 129 TiB 129 TiB 493 MiB 267 GiB 38 TiB 77.15 MIN/MAX VAR: 0.10/1.14 STDDEV: 34.38 Regards David Herselman |
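A minimal sketch for picking out the OSDs at or above a given utilisation from `ceph osd df` output such as the listing above. It assumes the Nautilus column layout shown there, in which %USE is the fourth field from the end; `osds_over` is a hypothetical helper name, not a Ceph command.

```shell
#!/bin/sh
# Print "<osd-id> <%USE>" for each OSD row at or above a utilisation threshold.
# Assumption: the 20-field `ceph osd df` row layout shown above, in which
# %USE is the fourth field from the end (followed by VAR, PGS and STATUS).
osds_over() {
    awk -v limit="$1" '
        # Only rows with a numeric ID; this skips the header, TOTAL and MIN/MAX lines.
        $1 ~ /^[0-9]+$/ && NF >= 20 && ($(NF-3) + 0) >= (limit + 0) {
            print $1, $(NF-3)
        }
    '
}

# Example invocation on a live cluster:
#   ceph osd df | osds_over 85
```

Against the listing above, a threshold of 85 would select only the rows whose %USE is 85.00 or higher, for example OSD 100 at 87.55 and OSD 103 at 88.29, while OSD 104 at 64.67 would be skipped.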