Hi Daniel,

Changing pg_num while some OSDs are almost full is not a good strategy (it can even be dangerous).

What is causing this backfilling? Loss of an OSD? The balancer? Something else?
What is the utilization of your least-full OSD (ceph osd df | sort -nrk17)?
Is the balancer activated, and in upmap mode?

Once the situation stabilizes, it becomes worth thinking about the number of PGs per OSD =>
https://docs.ceph.com/en/latest/rados/operations/placement-groups/#managing-pools-that-are-flagged-with-bulk
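For the checks above, something like this should do (just a sketch; %USE is the 17th whitespace-separated field of "ceph osd df" output, hence the sort key):

  ceph osd df | sort -nrk17 | head -n 5    # fullest OSDs first
  ceph osd df | sort -nrk17 | tail -n 5    # least-full OSDs
  ceph balancer status                     # is it active, and in which mode (upmap?)
  ceph osd pool autoscale-status           # current BULK flags and PG targets per pool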
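And once the backfilling is finished and no OSD is close to full anymore, the PG-per-OSD question is, as I understand it, better handled through the bulk flag and mon_target_pg_per_osd than by bumping pg_num by hand. A rough sketch only (pool name taken from your output below; please read the doc linked above before changing anything):

  ceph config get mgr mon_target_pg_per_osd    # the autoscaler's per-OSD target, default 100
  ceph osd pool set storagefs bulk true        # let the autoscaler give this big EC pool its full PG budget
  # if you really want to steer it further there is also a per-pool bias,
  # the value below is only a placeholder:
  # ceph osd pool set storagefs pg_autoscale_bias 2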
On Wed, 27 Mar 2024 at 09:41, Daniel Williams <danielwoz@xxxxxxxxx> wrote:
> Hey,
>
> I'm running ceph version 18.2.1 (reef), but this problem must have existed
> a long time before reef.
>
> The documentation says the autoscaler will target 100 PGs per OSD, but I'm
> only seeing ~10. My erasure coding is a stripe of 6 data, 3 parity. Could
> that be the reason? Are PG numbers for that EC pool therefore multiplied
> by k+m in the autoscaler calculations?
>
> Is backfill_toofull calculated from the total size of the PG against every
> OSD it is destined for? In my case I have ~1 TiB PGs because the autoscaler
> is creating only 10 per host, and backfill_toofull then considers that one
> of my OSDs has only ~500 GiB free. That doesn't quite add up either, though,
> because two ~1 TiB PGs that include OSD 1 are currently backfilling. My
> backfill full ratio is set to 97%.
>
> Would it be correct for me to change the autoscaler to target ~700 PGs per
> OSD, and set the bias for storagefs and all EC pools to k+m? Should that be
> the default, or the documentation's recommended value?
>
> How scary is changing pg_num while backfilling misplaced PGs? It seems like
> there's a chance the backfill might succeed, so I think I can wait.
>
> Any help is greatly appreciated. I've tried to include as much of the
> relevant debugging output as I can think of.
>
> Daniel
>
> # ceph osd ls | wc -l
> 44
> # ceph pg ls | wc -l
> 484
>
> # ceph osd pool autoscale-status
> POOL                   SIZE    TARGET SIZE  RATE   RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
> .rgw.root              216.0k               3.0    480.2T        0.0000                                 1.0   32                  on         False
> default.rgw.control    0                    3.0    480.2T        0.0000                                 1.0   32                  on         False
> default.rgw.meta       0                    3.0    480.2T        0.0000                                 1.0   32                  on         False
> default.rgw.log        1636k                3.0    480.2T        0.0000                                 1.0   32                  on         False
> storagefs              233.5T               1.5    480.2T        0.7294                                 1.0   256                 on         False
> storagefs-meta         850.2M               4.0    480.2T        0.0000                                 4.0   32                  on         False
> storagefs_wide         355.3G               1.375  480.2T        0.0010                                 1.0   32                  on         False
> .mgr                   457.3M               3.0    480.2T        0.0000                                 1.0   1                   on         False
> mgr-backup-2022-08-19  370.6M               3.0    480.2T        0.0000                                 1.0   32                  on         False
>
> # ceph osd pool ls detail | column -t
> pool 15 '.rgw.root'              replicated           size 3   min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
> pool 16 'default.rgw.control'    replicated           size 3   min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
> pool 17 'default.rgw.meta'       replicated           size 3   min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
> pool 18 'default.rgw.log'        replicated           size 3   min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
> pool 36 'storagefs'              erasure profile 6.3  size 9   min_size 7  crush_rule 2  object_hash rjenkins  pg_num 256  pgp_num 256  autoscale_mode on
> pool 37 'storagefs-meta'         replicated           size 4   min_size 1  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
> pool 45 'storagefs_wide'         erasure profile 8.3  size 11  min_size 9  crush_rule 8  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
> pool 46 '.mgr'                   replicated           size 3   min_size 2  crush_rule 0  object_hash rjenkins  pg_num 1    pgp_num 1    autoscale_mode on
> pool 48 'mgr-backup-2022-08-19'  replicated           size 3   min_size 2  crush_rule 0  object_hash rjenkins  pg_num 32   pgp_num 32   autoscale_mode on
>
> # ceph osd erasure-code-profile get 6.3
> crush-device-class=
> crush-failure-domain=host
> crush-root=default
> jerasure-per-chunk-alignment=false
> k=6
> m=3
> plugin=jerasure
> technique=reed_sol_van
> w=8
>
> # ceph pg ls | awk 'NR==1 || /backfill_toofull/' | awk '{print $1" "$2" "$4" "$6" "$11" "$15" "$16}' | column -t
> PG     OBJECTS  MISPLACED  BYTES         STATE                             UP                              ACTING
> 36.f   222077   141392     953817797727  active+remapped+backfill_toofull  [1,27,41,8,36,17,14,40,32]p1    [33,32,29,23,16,17,28,1,14]p33
> 36.5c  221761   147015     950692130045  active+remapped+backfill_toofull  [26,27,40,29,1,37,39,11,42]p26  [12,24,4,2,31,25,17,33,8]p12
> 36.60  222710   0          957109050809  active+remapped+backfill_toofull  [41,34,22,3,1,35,9,39,29]p41    [2,34,22,3,27,32,28,24,1]p2
> 36.6b  222202   427168     953843892012  active+remapped+backfill_toofull  [20,15,7,21,37,1,38,17,32]p20   [7,2,32,26,5,35,24,17,23]p7
> 36.74  222681   777546     957679960067  active+remapped+backfill_toofull  [42,24,12,34,38,10,27,1,25]p42  [34,33,12,0,19,14,17,30,25]p34
> 36.7b  222974   1560818    957691042940  active+remapped+backfill_toofull  [2,35,27,1,20,18,19,12,8]p2     [31,23,21,24,35,18,19,33,25]p31
> 36.82  222362   1998670    954507657022  active+remapped+backfill_toofull  [37,22,1,38,11,23,27,32,33]p37  [27,33,0,32,5,25,20,13,15]p27
> 36.b5  221676   1330056    953443725830  active+remapped+backfill_toofull  [6,8,38,12,21,1,39,34,27]p6     [33,8,26,12,3,10,22,34,1]p33
> 36.b6  222669   1335327    956973704883  active+remapped+backfill_toofull  [11,13,41,4,12,34,29,6,1]p11    [2,29,34,4,12,9,15,6,28]p2
> 36.e0  221518   1772144    952581426388  active+remapped+backfill_toofull  [1,27,21,31,30,23,37,13,28]p1   [25,21,14,31,1,2,34,17,24]p25
>
> ceph pg ls | awk 'NR==1 || /backfilling/' | grep -e BYTES -e '\[1' -e ',1,' -e '1\]' | awk '{print $1" "$2" "$4" "$6" "$11" "$15" "$16}' | column -t
> PG     OBJECTS  MISPLACED  BYTES         STATE                        UP                              ACTING
> 36.4a  221508   89144      951346455917  active+remapped+backfilling  [40,43,33,32,30,38,22,35,9]p40  [27,10,20,7,30,21,1,28,31]p27
> 36.79  222315   1111575    955797107713  active+remapped+backfilling  [1,36,31,33,25,23,14,3,13]p1    [27,6,31,23,25,5,14,29,13]p27
> 36.8d  222229   1284156    955234423342  active+remapped+backfilling  [35,34,27,37,38,36,43,3,16]p35  [35,34,15,26,1,11,27,18,16]p35
> 36.ba  222039   0          952547107971  active+remapped+backfilling  [0,40,33,23,41,4,27,22,28]p0    [0,35,33,27,1,3,30,22,28]p0
> 36.da  221607   277464     951599928383  active+remapped+backfilling  [21,31,8,9,11,25,36,23,28]p21   [0,10,1,22,33,11,35,15,28]p0
> 36.db  221685   58816      951420054091  active+remapped+backfilling  [3,28,12,13,1,38,40,35,43]p3    [27,20,17,21,1,23,28,24,31]p27
>
> # ceph osd df | sort -nk 17 | tail -n 5
> 21  hdd  9.09598  1.00000  9.1 TiB  7.7 TiB  7.7 TiB    0 B   31 GiB   1.4 TiB  84.62  1.16  68  up
> 24  hdd  9.09598  1.00000  9.1 TiB  7.7 TiB  7.7 TiB  1 KiB   25 GiB   1.4 TiB  84.98  1.16  69  up
> 29  hdd  9.09569  1.00000  9.1 TiB  8.0 TiB  8.0 TiB  72 MiB  23 GiB   1.1 TiB  88.42  1.21  73  up
> 13  hdd  9.09569  1.00000  9.1 TiB  8.1 TiB  8.1 TiB  1 KiB   22 GiB  1023 GiB  89.02  1.22  76  up
>  1  hdd  7.27698  1.00000  7.3 TiB  6.8 TiB  6.8 TiB  27 MiB  18 GiB   451 GiB  93.94  1.28  64  up
>
> # cat /etc/ceph/ceph.conf | grep full
> mon_osd_full_ratio = .98
> mon_osd_nearfull_ratio = .96
> mon_osd_backfillfull_ratio = .97
> osd_backfill_full_ratio = .97
> osd_failsafe_full_ratio = .99
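One more note on the full ratios at the end of your mail: as far as I know, on current releases the nearfull/backfillfull/full ratios that the OSDs actually enforce are stored in the OSDMap, and the mon_osd_*_ratio options in ceph.conf only seed them at cluster creation. So it is worth confirming what the cluster is really using, and changing it at runtime if needed; a sketch:

  ceph osd dump | grep ratio              # ratios currently in the OSDMap
  # and, if needed, adjust at runtime, for example:
  # ceph osd set-backfillfull-ratio 0.97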