If you increase the number of PGs, each one is effectively smaller, so the backfill process may be able to 'squeeze' them onto the nearly full OSDs while it sorts things out. I've had something similar before, and this definitely helped.
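Concretely, that would be something like the following. Pool name and target count here are only an example (I'd start with the biggest pool and round up to a power of two); on Nautilus and later pgp_num follows pg_num automatically, on older releases bump it as well:

ceph osd pool get cephfs.backup.data pg_num
ceph osd pool set cephfs.backup.data pg_num 512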
On 12 Apr 2021, at 19:11, Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:

You know you can play a bit with the ratios?

ceph tell osd.* injectargs '--mon_osd_full_ratio=0.950000'
ceph tell osd.* injectargs '--mon_osd_backfillfull_ratio=0.900000'
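One caveat: from Luminous onwards these ratios are kept in the OSDMap, so injectargs on the OSDs may not actually take effect; the mon-side commands are the reliable way to change them. The values below are only a sketch, raised just enough to let backfill proceed:

ceph osd set-backfillfull-ratio 0.92
ceph osd set-nearfull-ratio 0.88
ceph osd set-full-ratio 0.95

Remember to lower them again once the cluster has rebalanced, so you keep a real safety margin before any OSD goes full.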
> -----Original Message-----
> From: Ml Ml <mliebherr99@xxxxxxxxxxxxxx>
> Sent: 12 April 2021 19:31
> To: ceph-users <ceph-users@xxxxxxx>
> Subject: HEALTH_WARN - Recovery Stuck?
>
> Hello,
>
> I kind of ran out of disk space, so I added another host with osd.37,
> but it does not seem to move much data onto it (85 MB in 2h).
>
> Any idea why the recovery process seems to be stuck? Should I fix the
> 4 backfillfull OSDs first (by changing the weight)?
>
> root@ceph01:~# ceph -s
>   cluster:
>     id:     5436dd5d-83d4-4dc8-a93b-60ab5db145df
>     health: HEALTH_WARN
>             4 backfillfull osd(s)
>             9 nearfull osd(s)
>             Low space hindering backfill (add storage if this doesn't resolve itself): 1 pg backfill_toofull
>             4 pool(s) backfillfull
>
>   services:
>     mon: 3 daemons, quorum ceph03,ceph01,ceph02 (age 12d)
>     mgr: ceph03(active, since 4M), standbys: ceph02.jwvivm
>     mds: backup:1 {0=backup.ceph06.hdjehi=up:active} 3 up:standby
>     osd: 53 osds: 53 up (since 2h), 53 in (since 2h); 235 remapped pgs
>
>   task status:
>     scrub status:
>       mds.backup.ceph06.hdjehi: idle
>
>   data:
>     pools:   4 pools, 1185 pgs
>     objects: 24.69M objects, 45 TiB
>     usage:   149 TiB used, 42 TiB / 191 TiB avail
>     pgs:     5388809/74059569 objects misplaced (7.276%)
>              950 active+clean
>              232 active+remapped+backfill_wait
>              2   active+remapped+backfilling
>              1   active+remapped+backfill_wait+backfill_toofull
>
>   io:
>     recovery: 0 B/s, 171 keys/s, 16 objects/s
>
>   progress:
>     Rebalancing after osd.37 marked in (2h)
>       [............................] (remaining: 6d)
>
> root@ceph01:~# ceph health detail
> HEALTH_WARN 4 backfillfull osd(s); 9 nearfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 1 pg backfill_toofull; 4 pool(s) backfillfull
> [WRN] OSD_BACKFILLFULL: 4 backfillfull osd(s)
>     osd.28 is backfill full
>     osd.32 is backfill full
>     osd.66 is backfill full
>     osd.68 is backfill full
> [WRN] OSD_NEARFULL: 9 nearfull osd(s)
>     osd.11 is near full
>     osd.24 is near full
>     osd.27 is near full
>     osd.39 is near full
>     osd.40 is near full
>     osd.42 is near full
>     osd.43 is near full
>     osd.45 is near full
>     osd.69 is near full
> [WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 1 pg backfill_toofull
>     pg 23.295 is active+remapped+backfill_wait+backfill_toofull, acting [8,67,32]
> [WRN] POOL_BACKFILLFULL: 4 pool(s) backfillfull
>     pool 'backurne-rbd' is backfillfull
>     pool 'device_health_metrics' is backfillfull
>     pool 'cephfs.backup.meta' is backfillfull
>     pool 'cephfs.backup.data' is backfillfull
>
> root@ceph01:~# ceph osd df tree
> ID  CLASS  WEIGHT  REWEIGHT  SIZE  RAW USE  DATA  OMAP  META  AVAIL  %USE  VAR  PGS  STATUS  TYPE NAME
> -1  182.59897  -  191 TiB  149 TiB  149 TiB  35 GiB  503 GiB  42 TiB  77.96  1.00  -  root default
> -2  24.62473  -  29 TiB  22 TiB  22 TiB  5.0 GiB  80 GiB  7.1 TiB  75.23  0.96  -  host ceph01
> 0  hdd  2.39999  1.00000  2.7 TiB  2.2 TiB  2.2 TiB  665 MiB  8.0 GiB  480 GiB  82.43  1.06  53  up  osd.0
> 1  hdd  2.29999  1.00000  2.7 TiB  2.1 TiB  2.1 TiB  446 MiB  7.5 GiB  590 GiB  78.44  1.01  49  up  osd.1
> 4  hdd  2.67029  0.91066  2.7 TiB  2.2 TiB  2.2 TiB  484 MiB  7.9 GiB  440 GiB  83.90  1.08  53  up  osd.4
> 8  hdd  2.39999  1.00000  2.7 TiB  2.1 TiB  2.1 TiB  490 MiB  7.9 GiB  533 GiB  80.49  1.03  51  up  osd.8
> 11  hdd  1.71660  1.00000  1.7 TiB  1.5 TiB  1.5 TiB  406 MiB  5.5 GiB  200 GiB  88.60  1.14  36  up  osd.11
> 12  hdd  1.29999  1.00000  2.7 TiB  1.2 TiB  1.2 TiB  366 MiB  4.9 GiB  1.5 TiB  43.89  0.56  28  up  osd.12
> 14  hdd  2.20000  1.00000  2.7 TiB  2.0 TiB  2.0 TiB  418 MiB  7.1 GiB  693 GiB  74.66  0.96  47  up  osd.14
> 18  hdd  2.20000  1.00000  2.7 TiB  2.0 TiB  1.9 TiB  434 MiB  7.3 GiB  737 GiB  73.05  0.94  47  up  osd.18
> 22  hdd  1.00000  1.00000  1.7 TiB  890 GiB  886 GiB  110 MiB  3.6 GiB  868 GiB  50.62  0.65  20  up  osd.22
> 30  hdd  1.50000  1.00000  1.7 TiB  1.4 TiB  1.3 TiB  361 MiB  4.9 GiB  370 GiB  78.93  1.01  32  up  osd.30
> 33  hdd  1.59999  0.97437  1.6 TiB  1.4 TiB  1.4 TiB  397 MiB  5.4 GiB  213 GiB  87.20  1.12  34  up  osd.33
> 64  hdd  3.33789  0.89752  3.3 TiB  2.7 TiB  2.7 TiB  573 MiB  9.9 GiB  647 GiB  81.07  1.04  64  up  osd.64
> -3  26.79504  -  30 TiB  24 TiB  24 TiB  6.2 GiB  89 GiB  5.4 TiB  81.80  1.05  -  host ceph02
> 2  hdd  1.50000  1.00000  1.7 TiB  1.4 TiB  1.4 TiB  363 MiB  5.3 GiB  359 GiB  79.58  1.02  32  up  osd.2
> 3  hdd  2.50000  1.00000  2.7 TiB  2.2 TiB  2.2 TiB  647 MiB  7.8 GiB  469 GiB  82.85  1.06  53  up  osd.3
> 7  hdd  2.00000  1.00000  2.7 TiB  1.8 TiB  1.8 TiB  453 MiB  7.0 GiB  848 GiB  69.00  0.89  43  up  osd.7
> 9  hdd  2.67029  0.98323  2.7 TiB  2.4 TiB  2.3 TiB  709 MiB  8.8 GiB  322 GiB  88.21  1.13  57  up  osd.9
> 13  hdd  1.79999  1.00000  2.4 TiB  1.7 TiB  1.6 TiB  410 MiB  6.5 GiB  747 GiB  69.41  0.89  40  up  osd.13
> 16  hdd  2.50000  1.00000  2.7 TiB  2.2 TiB  2.2 TiB  637 MiB  7.8 GiB  458 GiB  83.26  1.07  53  up  osd.16
> 19  hdd  1.39999  1.00000  1.7 TiB  1.3 TiB  1.3 TiB  345 MiB  5.1 GiB  465 GiB  73.53  0.94  30  up  osd.19
> 23  hdd  2.00000  1.00000  2.7 TiB  1.9 TiB  1.9 TiB  442 MiB  7.7 GiB  738 GiB  73.02  0.94  43  up  osd.23
> 24  hdd  1.71660  0.95634  1.7 TiB  1.5 TiB  1.5 TiB  426 MiB  5.8 GiB  187 GiB  89.37  1.15  36  up  osd.24
> 28  hdd  2.70000  1.00000  2.7 TiB  2.5 TiB  2.4 TiB  712 MiB  8.4 GiB  219 GiB  92.00  1.18  58  up  osd.28
> 31  hdd  2.67029  0.92993  2.7 TiB  2.3 TiB  2.3 TiB  465 MiB  8.1 GiB  393 GiB  85.62  1.10  54  up  osd.31
> 32  hdd  3.33789  1.00000  3.3 TiB  3.0 TiB  3.0 TiB  693 MiB  11 GiB  306 GiB  91.06  1.17  71  up  osd.32
> -4  24.52005  -  26 TiB  21 TiB  21 TiB  5.0 GiB  79 GiB  5.1 TiB  80.51  1.03  -  host ceph03
> 5  hdd  1.71660  1.00000  1.7 TiB  1.5 TiB  1.5 TiB  392 MiB  5.6 GiB  223 GiB  87.34  1.12  35  up  osd.5
> 6  hdd  1.71660  1.00000  1.7 TiB  1.5 TiB  1.5 TiB  397 MiB  5.6 GiB  221 GiB  87.41  1.12  35  up  osd.6
> 10  hdd  2.50000  0.97487  2.7 TiB  2.2 TiB  2.2 TiB  497 MiB  7.7 GiB  480 GiB  82.46  1.06  52  up  osd.10
> 15  hdd  2.29999  1.00000  2.7 TiB  2.1 TiB  2.1 TiB  474 MiB  7.6 GiB  586 GiB  78.57  1.01  49  up  osd.15
> 17  hdd  1.39999  1.00000  1.6 TiB  1.2 TiB  1.2 TiB  352 MiB  5.6 GiB  384 GiB  76.88  0.99  30  up  osd.17
> 20  hdd  1.59999  1.00000  1.7 TiB  1.4 TiB  1.4 TiB  234 MiB  5.4 GiB  331 GiB  81.15  1.04  33  up  osd.20
> 21  hdd  2.00000  1.00000  2.7 TiB  1.8 TiB  1.8 TiB  611 MiB  7.0 GiB  868 GiB  68.27  0.88  44  up  osd.21
> 25  hdd  1.70000  0.92348  1.7 TiB  1.4 TiB  1.4 TiB  407 MiB  5.6 GiB  274 GiB  84.41  1.08  35  up  osd.25
> 26  hdd  2.50000  1.00000  2.7 TiB  2.2 TiB  2.2 TiB  464 MiB  7.8 GiB  441 GiB  83.88  1.08  52  up  osd.26
> 27  hdd  2.70000  0.95955  2.7 TiB  2.4 TiB  2.4 TiB  674 MiB  8.3 GiB  318 GiB  88.35  1.13  57  up  osd.27
> 29  hdd  2.67029  0.73337  2.7 TiB  1.8 TiB  1.8 TiB  436 MiB  6.7 GiB  885 GiB  67.63  0.87  43  up  osd.29
> 63  hdd  1.71660  1.00000  1.7 TiB  1.5 TiB  1.5 TiB  226 MiB  5.7 GiB  224 GiB  87.26  1.12  35  up  osd.63
> -11  24.64297  -  25 TiB  21 TiB  21 TiB  4.9 GiB  66 GiB  3.4 TiB  86.48  1.11  -  host ceph04
> 34  hdd  5.24519  0.85004  5.2 TiB  4.0 TiB  4.0 TiB  1002 MiB  13 GiB  1.2 TiB  76.37  0.98  97  up  osd.34
> 42  hdd  5.24519  1.00000  5.2 TiB  4.7 TiB  4.7 TiB  1.1 GiB  15 GiB  545 GiB  89.86  1.15  113  up  osd.42
> 44  hdd  7.00000  1.00000  7.2 TiB  6.3 TiB  6.3 TiB  1.4 GiB  19 GiB  901 GiB  87.70  1.12  150  up  osd.44
> 45  hdd  7.15259  1.00000  7.2 TiB  6.5 TiB  6.4 TiB  1.5 GiB  19 GiB  718 GiB  90.20  1.16  154  up  osd.45
> -13  30.04085  -  30 TiB  26 TiB  26 TiB  5.8 GiB  81 GiB  4.2 TiB  86.11  1.10  -  host ceph05
> 39  hdd  7.15259  1.00000  7.2 TiB  6.4 TiB  6.4 TiB  1.5 GiB  19 GiB  751 GiB  89.74  1.15  153  up  osd.39
> 40  hdd  7.15259  1.00000  7.2 TiB  6.4 TiB  6.4 TiB  1.3 GiB  19 GiB  767 GiB  89.53  1.15  153  up  osd.40
> 41  hdd  7.15259  0.90002  7.2 TiB  5.8 TiB  5.7 TiB  1.2 GiB  18 GiB  1.4 TiB  80.54  1.03  138  up  osd.41
> 43  hdd  5.24519  1.00000  5.2 TiB  4.7 TiB  4.7 TiB  1.1 GiB  15 GiB  574 GiB  89.32  1.15  113  up  osd.43
> 60  hdd  3.33789  0.85780  3.3 TiB  2.6 TiB  2.6 TiB  685 MiB  8.9 GiB  754 GiB  77.93  1.00  62  up  osd.60
> -9  17.64297  -  18 TiB  13 TiB  13 TiB  3.0 GiB  43 GiB  4.4 TiB  74.85  0.96  -  host ceph06
> 35  hdd  7.15259  0.80005  7.2 TiB  5.2 TiB  5.2 TiB  1.0 GiB  16 GiB  2.0 TiB  72.31  0.93  124  up  osd.35
> 36  hdd  5.24519  0.85004  5.2 TiB  4.0 TiB  4.0 TiB  985 MiB  13 GiB  1.2 TiB  76.65  0.98  97  up  osd.36
> 38  hdd  5.24519  0.85004  5.2 TiB  4.0 TiB  4.0 TiB  1.0 GiB  13 GiB  1.2 TiB  76.50  0.98  97  up  osd.38
> -15  24.79565  -  25 TiB  22 TiB  22 TiB  4.7 GiB  66 GiB  3.1 TiB  87.64  1.12  -  host ceph07
> 66  hdd  7.15259  1.00000  7.2 TiB  6.5 TiB  6.5 TiB  1.5 GiB  19 GiB  670 GiB  90.86  1.17  155  up  osd.66
> 67  hdd  7.15259  0.91141  7.2 TiB  5.8 TiB  5.8 TiB  1.1 GiB  18 GiB  1.3 TiB  81.62  1.05  140  up  osd.67
> 68  hdd  3.33789  1.00000  3.3 TiB  3.0 TiB  3.0 TiB  738 MiB  9.8 GiB  299 GiB  91.24  1.17  71  up  osd.68
> 69  hdd  7.15259  1.00000  7.2 TiB  6.3 TiB  6.3 TiB  1.3 GiB  19 GiB  823 GiB  88.77  1.14  152  up  osd.69
> -17  9.53670  -  9.5 TiB  1.4 GiB  85 MiB  473 MiB  832 MiB  9.5 TiB  0.01  0  -  host ceph08
> 37  hdd  9.53670  1.00000  9.5 TiB  1.4 GiB  85 MiB  473 MiB  832 MiB  9.5 TiB  0.01  0  2  up  osd.37
> TOTAL  191 TiB  149 TiB  149 TiB  35 GiB  503 GiB  42 TiB  77.96
> MIN/MAX VAR: 0/1.18  STDDEV: 14.73
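As for the question above about fixing the backfillfull OSDs by changing the weight: an override reweight on the fullest OSDs is the usual lever. A sketch (the 0.90 values are only illustrative):

ceph osd reweight 28 0.90
ceph osd reweight 32 0.90

or let Ceph pick the adjustments with 'ceph osd reweight-by-utilization 110'. Either way the aim is just to push a few PGs off the fullest OSDs so the stuck backfill can make progress.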
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx