The backfilling was caused by decommissioning an old host and moving a bunch of
OSDs to new machines. The balancer has not been activated since the backfill
started / since the OSDs were moved around on hosts. Busy OSD level? Do you mean
fullness? The cluster is relatively unused in terms of busyness (client I/O).

# ceph status
  cluster:
    health: HEALTH_WARN
            noout flag(s) set
            Low space hindering backfill (add storage if this doesn't resolve itself): 10 pgs backfill_toofull

  services:
    mon: 4 daemons, quorum ceph-server-02,ceph-server-04,ceph-server-01,ceph-server-05 (age 6d)
    mgr: ceph-server-01.gfavjb(active, since 6d), standbys: ceph-server-05.swmxto, ceph-server-04.ymoarr, ceph-server-02.zzcppv
    mds: 1/1 daemons up, 3 standby
    osd: 44 osds: 44 up (since 6d), 44 in (since 6d); 19 remapped pgs
         flags noout

  data:
    volumes: 1/1 healthy
    pools:   9 pools, 481 pgs
    objects: 57.41M objects, 222 TiB
    usage:   351 TiB used, 129 TiB / 480 TiB avail
    pgs:     13895113/514097636 objects misplaced (2.703%)
             455 active+clean
             10  active+remapped+backfill_toofull
             9   active+remapped+backfilling
             5   active+clean+scrubbing+deep
             2   active+clean+scrubbing

  io:
    client: 7.5 MiB/s rd, 4.8 KiB/s wr, 28 op/s rd, 1 op/s wr

# ceph osd df | sort -rnk 17
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META      AVAIL    %USE   VAR   PGS  STATUS
 0  hdd     9.09598   1.00000  9.1 TiB  6.0 TiB  6.0 TiB      0 B   18 GiB   3.1 TiB  65.96  0.90   62  up
11  hdd    10.91423   1.00000   11 TiB  7.0 TiB  7.0 TiB   40 MiB   18 GiB   3.9 TiB  64.26  0.88   70  up
43  hdd    14.55269   1.00000   15 TiB  9.3 TiB  9.3 TiB  117 MiB   24 GiB   5.3 TiB  63.92  0.87   87  up
26  hdd    12.73340   1.00000   13 TiB  7.9 TiB  7.9 TiB   54 MiB   21 GiB   4.8 TiB  61.98  0.85   80  up
35  hdd    14.55269   1.00000   15 TiB  8.9 TiB  8.9 TiB   46 MiB   25 GiB   5.7 TiB  61.05  0.83   87  up
 5  hdd     9.09569   1.00000  9.1 TiB  5.5 TiB  5.5 TiB    1 KiB   15 GiB   3.6 TiB  60.71  0.83   54  up
                       TOTAL   480 TiB  351 TiB  350 TiB  2.6 GiB  1018 GiB  129 TiB  73.12

# ceph balancer status
{
    "active": true,
    "last_optimize_duration": "0:00:00.000326",
    "last_optimize_started": "Wed Mar 27 09:04:32 2024",
    "mode": "upmap",
    "no_optimization_needed": false,
    "optimize_result": "Too many objects (0.027028 > 0.010000) are misplaced; try again later",
    "plans": []
}
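
Worth noting on the balancer output: the 0.010000 in "optimize_result" is, as far
as I can tell, the mgr option target_max_misplaced_ratio, which caps how much
misplacement the balancer (and the pg_autoscaler) will tolerate before standing
down. A minimal sketch for checking or adjusting it, assuming reef still uses
that option name (0.05 is only an example value, not a recommendation):

# ceph config get mgr target_max_misplaced_ratio
# ceph config set mgr target_max_misplaced_ratio 0.05

Raising it only lets the balancer/autoscaler create more misplacement; it does
nothing for the backfill_toofull PGs themselves.
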
On Wed, Mar 27, 2024 at 4:53 PM David C. <david.casier@xxxxxxxx> wrote:

> Hi Daniel,
>
> Changing pg_num while some OSDs are almost full is not a good strategy (it
> can even be dangerous).
>
> What is causing this backfilling? Loss of an OSD? The balancer? Something
> else?
>
> What is the level of the least busy OSD (sort -nrk17)?
>
> Is the balancer activated (upmap)?
>
> Once the situation stabilizes, it becomes worthwhile to think about the
> number of PGs per OSD =>
> https://docs.ceph.com/en/latest/rados/operations/placement-groups/#managing-pools-that-are-flagged-with-bulk
>
> On Wed, 27 Mar 2024 at 09:41, Daniel Williams <danielwoz@xxxxxxxxx> wrote:
>
>> Hey,
>>
>> I'm running ceph version 18.2.1 (reef), but this problem must have existed
>> for a long time before reef.
>>
>> The documentation says the autoscaler will target 100 PGs per OSD, but I'm
>> only seeing ~10. My erasure coding is a stripe of 6 data + 3 parity. Could
>> that be the reason, i.e. are PG numbers for that EC pool multiplied by k+m
>> in the autoscaler calculations?
>>
>> Is backfill_toofull calculated against the total size of the PG against
>> every OSD it is destined for? In my case I have ~1 TiB PGs because the
>> autoscaler is creating only 10 per host, and backfill_toofull is then
>> considering that one of my OSDs only has 500 GiB free, although that
>> doesn't quite add up either, because two 1 TiB PGs are backfilling two PGs
>> that have OSD 1 in them. My backfill full ratio is set to 97%.
>>
>> Would it be correct for me to change the autoscaler to target ~700 PGs per
>> OSD, and the bias for storagefs and all EC pools to k+m? Should that be
>> the default, or the value the documentation recommends?
>>
>> How scary is changing pg_num while backfilling misplaced PGs? It seems
>> like there's a chance the backfill might succeed, so I think I can wait.
>>
>> Any help is greatly appreciated; I've tried to include as much of the
>> relevant debugging output as I can think of.
>>
>> Daniel
>>
>> # ceph osd ls | wc -l
>> 44
>> # ceph pg ls | wc -l
>> 484
>>
>> # ceph osd pool autoscale-status
>> POOL                   SIZE    TARGET SIZE  RATE   RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
>> .rgw.root              216.0k               3.0    480.2T        0.0000                                 1.0   32                  on         False
>> default.rgw.control    0                    3.0    480.2T        0.0000                                 1.0   32                  on         False
>> default.rgw.meta       0                    3.0    480.2T        0.0000                                 1.0   32                  on         False
>> default.rgw.log        1636k                3.0    480.2T        0.0000                                 1.0   32                  on         False
>> storagefs              233.5T               1.5    480.2T        0.7294                                 1.0   256                 on         False
>> storagefs-meta         850.2M               4.0    480.2T        0.0000                                 4.0   32                  on         False
>> storagefs_wide         355.3G               1.375  480.2T        0.0010                                 1.0   32                  on         False
>> .mgr                   457.3M               3.0    480.2T        0.0000                                 1.0   1                   on         False
>> mgr-backup-2022-08-19  370.6M               3.0    480.2T        0.0000                                 1.0   32                  on         False
>>
>> # ceph osd pool ls detail | column -t
>> pool 15 '.rgw.root'              replicated size 3         min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>> pool 16 'default.rgw.control'    replicated size 3         min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>> pool 17 'default.rgw.meta'       replicated size 3         min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>> pool 18 'default.rgw.log'        replicated size 3         min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>> pool 36 'storagefs'              erasure profile 6.3 size 9   min_size 7  crush_rule 2  object_hash rjenkins  pg_num 256  pgp_num 256  autoscale_mode on
>> pool 37 'storagefs-meta'         replicated size 4         min_size 1  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>> pool 45 'storagefs_wide'         erasure profile 8.3 size 11  min_size 9  crush_rule 8  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>> pool 46 '.mgr'                   replicated size 3         min_size 2  crush_rule 0  object_hash rjenkins  pg_num 1    pgp_num 1    autoscale_mode on
>> pool 48 'mgr-backup-2022-08-19'  replicated size 3         min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>>
>> # ceph osd erasure-code-profile get 6.3
>> crush-device-class=
>> crush-failure-domain=host
>> crush-root=default
>> jerasure-per-chunk-alignment=false
>> k=6
>> m=3
>> plugin=jerasure
>> technique=reed_sol_van
>> w=8
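
A note on the "100 PGs per OSD but I only see ~10" puzzle: the autoscaler's
target (mon_target_pg_per_osd, default 100) is counted in PG replicas/shards per
OSD, not PGs per pool, and each EC 6+3 PG costs k+m = 9 shards. From the output
above, 256 x 9 for storagefs, plus 32 x 11 for storagefs_wide, plus the small
replicated pools comes to roughly 3,270 PG shards, i.e. about 74 per OSD across
44 OSDs, which matches the PGS column in ceph osd df (54-87) rather than ~10.
The knobs involved, as a sketch only (per David's advice, better left untouched
until the backfill finishes; the bulk flag is what his docs link is about):

# ceph config get mon mon_target_pg_per_osd
# ceph osd pool get storagefs pg_autoscale_bias
# ceph osd pool set storagefs bulk true

So a k+m bias or a 700-per-OSD target shouldn't be needed; if more PGs per pool
are wanted, flagging the big data pools as bulk is the documented route.
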
>> # ceph pg ls | awk 'NR==1 || /backfill_toofull/' | awk '{print $1" "$2" "$4" "$6" "$11" "$15" "$16}' | column -t
>> PG     OBJECTS  MISPLACED  BYTES         STATE                             UP                              ACTING
>> 36.f   222077   141392     953817797727  active+remapped+backfill_toofull  [1,27,41,8,36,17,14,40,32]p1    [33,32,29,23,16,17,28,1,14]p33
>> 36.5c  221761   147015     950692130045  active+remapped+backfill_toofull  [26,27,40,29,1,37,39,11,42]p26  [12,24,4,2,31,25,17,33,8]p12
>> 36.60  222710   0          957109050809  active+remapped+backfill_toofull  [41,34,22,3,1,35,9,39,29]p41    [2,34,22,3,27,32,28,24,1]p2
>> 36.6b  222202   427168     953843892012  active+remapped+backfill_toofull  [20,15,7,21,37,1,38,17,32]p20   [7,2,32,26,5,35,24,17,23]p7
>> 36.74  222681   777546     957679960067  active+remapped+backfill_toofull  [42,24,12,34,38,10,27,1,25]p42  [34,33,12,0,19,14,17,30,25]p34
>> 36.7b  222974   1560818    957691042940  active+remapped+backfill_toofull  [2,35,27,1,20,18,19,12,8]p2     [31,23,21,24,35,18,19,33,25]p31
>> 36.82  222362   1998670    954507657022  active+remapped+backfill_toofull  [37,22,1,38,11,23,27,32,33]p37  [27,33,0,32,5,25,20,13,15]p27
>> 36.b5  221676   1330056    953443725830  active+remapped+backfill_toofull  [6,8,38,12,21,1,39,34,27]p6     [33,8,26,12,3,10,22,34,1]p33
>> 36.b6  222669   1335327    956973704883  active+remapped+backfill_toofull  [11,13,41,4,12,34,29,6,1]p11    [2,29,34,4,12,9,15,6,28]p2
>> 36.e0  221518   1772144    952581426388  active+remapped+backfill_toofull  [1,27,21,31,30,23,37,13,28]p1   [25,21,14,31,1,2,34,17,24]p25
>>
>> # ceph pg ls | awk 'NR==1 || /backfilling/' | grep -e BYTES -e '\[1' -e ',1,' -e '1\]' | awk '{print $1" "$2" "$4" "$6" "$11" "$15" "$16}' | column -t
>> PG     OBJECTS  MISPLACED  BYTES         STATE                        UP                              ACTING
>> 36.4a  221508   89144      951346455917  active+remapped+backfilling  [40,43,33,32,30,38,22,35,9]p40  [27,10,20,7,30,21,1,28,31]p27
>> 36.79  222315   1111575    955797107713  active+remapped+backfilling  [1,36,31,33,25,23,14,3,13]p1    [27,6,31,23,25,5,14,29,13]p27
>> 36.8d  222229   1284156    955234423342  active+remapped+backfilling  [35,34,27,37,38,36,43,3,16]p35  [35,34,15,26,1,11,27,18,16]p35
>> 36.ba  222039   0          952547107971  active+remapped+backfilling  [0,40,33,23,41,4,27,22,28]p0    [0,35,33,27,1,3,30,22,28]p0
>> 36.da  221607   277464     951599928383  active+remapped+backfilling  [21,31,8,9,11,25,36,23,28]p21   [0,10,1,22,33,11,35,15,28]p0
>> 36.db  221685   58816      951420054091  active+remapped+backfilling  [3,28,12,13,1,38,40,35,43]p3    [27,20,17,21,1,23,28,24,31]p27
>>
>> # ceph osd df | sort -nk 17 | tail -n 5
>> 21  hdd  9.09598  1.00000  9.1 TiB  7.7 TiB  7.7 TiB     0 B  31 GiB   1.4 TiB  84.62  1.16  68  up
>> 24  hdd  9.09598  1.00000  9.1 TiB  7.7 TiB  7.7 TiB   1 KiB  25 GiB   1.4 TiB  84.98  1.16  69  up
>> 29  hdd  9.09569  1.00000  9.1 TiB  8.0 TiB  8.0 TiB  72 MiB  23 GiB   1.1 TiB  88.42  1.21  73  up
>> 13  hdd  9.09569  1.00000  9.1 TiB  8.1 TiB  8.1 TiB   1 KiB  22 GiB  1023 GiB  89.02  1.22  76  up
>>  1  hdd  7.27698  1.00000  7.3 TiB  6.8 TiB  6.8 TiB  27 MiB  18 GiB   451 GiB  93.94  1.28  64  up
>>
>> # cat /etc/ceph/ceph.conf | grep full
>> mon_osd_full_ratio = .98
>> mon_osd_nearfull_ratio = .96
>> mon_osd_backfillfull_ratio = .97
>> osd_backfill_full_ratio = .97
>> osd_failsafe_full_ratio = .99
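
One more thing that may matter for the backfill_toofull flags: since Luminous,
the nearfull/backfillfull/full ratios that actually gate backfill are stored in
the OSDMap, and the mon_osd_*_ratio values in ceph.conf are, as far as I know,
only honoured when the cluster is first created (osd_backfill_full_ratio does
not appear to be a recognised option name any more). A sketch for checking what
the cluster is really enforcing, with 0.97 only as an example value:

# ceph osd dump | grep ratio
# ceph osd set-backfillfull-ratio 0.97

If the OSDMap still carries the default backfillfull_ratio of 0.90, that alone
could explain backfill_toofull on OSDs around ~90% used even though ceph.conf
says 0.97.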