Re: HEALTH_WARN - Recovery Stuck?

If you increase the number of PGs, each one is effectively smaller, so the backfill process may be able to ‘squeeze’ them onto the nearly full OSDs while it sorts things out.

I’ve had something similar before, and this definitely helped.
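
As a rough sketch (the target pg_num below is only an example and should be sized for the cluster; on recent releases the pool's pg_autoscale_mode may need to be set to warn or off first so the autoscaler doesn't revert the split), bumping the PG count of the big data pool would look something like:

ceph osd pool set cephfs.backup.data pg_autoscale_mode warn
ceph osd pool set cephfs.backup.data pg_num 2048
ceph osd pool set cephfs.backup.data pgp_num 2048

Pick the target so the cluster stays around the usual ~100 PGs per OSD rule of thumb.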

Sent from my iPhone

On 12 Apr 2021, at 19:11, Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:


You know you can play a bit with the ratios?

ceph tell osd.* injectargs '--mon_osd_full_ratio=0.950000'
ceph tell osd.* injectargs '--mon_osd_backfillfull_ratio=0.900000'
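
(On Luminous and later these thresholds live in the OSDMap rather than in the individual daemons, so the cluster-wide commands may be what actually takes effect. The 0.92 below is only an illustrative nudge upwards; set-nearfull-ratio and set-full-ratio work the same way, and the value should be dropped back once the backfill completes.)

ceph osd set-backfillfull-ratio 0.92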


> -----Original Message-----
> From: Ml Ml <mliebherr99@xxxxxxxxxxxxxx>
> Sent: 12 April 2021 19:31
> To: ceph-users <ceph-users@xxxxxxx>
> Subject:  HEALTH_WARN - Recovery Stuck?
> 
> Hello,
> 
> I kind of ran out of disk space, so I added another host with osd.37,
> but it does not seem to be moving much data onto it (85 MB in 2h).
> 
> Any idea why the recovery process seems to be stuck? Should I fix the
> 4 backfillfull OSDs first (by changing their weight)?
> 
> root@ceph01:~# ceph -s
>  cluster:
>    id:     5436dd5d-83d4-4dc8-a93b-60ab5db145df
>    health: HEALTH_WARN
>            4 backfillfull osd(s)
>            9 nearfull osd(s)
>            Low space hindering backfill (add storage if this doesn't resolve itself): 1 pg backfill_toofull
>            4 pool(s) backfillfull
> 
>  services:
>    mon: 3 daemons, quorum ceph03,ceph01,ceph02 (age 12d)
>    mgr: ceph03(active, since 4M), standbys: ceph02.jwvivm
>    mds: backup:1 {0=backup.ceph06.hdjehi=up:active} 3 up:standby
>    osd: 53 osds: 53 up (since 2h), 53 in (since 2h); 235 remapped pgs
> 
>  task status:
>    scrub status:
>        mds.backup.ceph06.hdjehi: idle
> 
>  data:
>    pools:   4 pools, 1185 pgs
>    objects: 24.69M objects, 45 TiB
>    usage:   149 TiB used, 42 TiB / 191 TiB avail
>    pgs:     5388809/74059569 objects misplaced (7.276%)
>             950 active+clean
>             232 active+remapped+backfill_wait
>             2   active+remapped+backfilling
>             1   active+remapped+backfill_wait+backfill_toofull
> 
>  io:
>    recovery: 0 B/s, 171 keys/s, 16 objects/s
> 
>  progress:
>    Rebalancing after osd.37 marked in (2h)
>      [............................] (remaining: 6d)
> 
> 
> 
> root@ceph01:~# ceph health detail
> HEALTH_WARN 4 backfillfull osd(s); 9 nearfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 1 pg backfill_toofull; 4 pool(s) backfillfull
> [WRN] OSD_BACKFILLFULL: 4 backfillfull osd(s)
>    osd.28 is backfill full
>    osd.32 is backfill full
>    osd.66 is backfill full
>    osd.68 is backfill full
> [WRN] OSD_NEARFULL: 9 nearfull osd(s)
>    osd.11 is near full
>    osd.24 is near full
>    osd.27 is near full
>    osd.39 is near full
>    osd.40 is near full
>    osd.42 is near full
>    osd.43 is near full
>    osd.45 is near full
>    osd.69 is near full
> [WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 1 pg backfill_toofull
>    pg 23.295 is active+remapped+backfill_wait+backfill_toofull, acting [8,67,32]
> [WRN] POOL_BACKFILLFULL: 4 pool(s) backfillfull
>    pool 'backurne-rbd' is backfillfull
>    pool 'device_health_metrics' is backfillfull
>    pool 'cephfs.backup.meta' is backfillfull
>    pool 'cephfs.backup.data' is backfillfull
> 
> 
> root@ceph01:~# ceph osd df tree
> ID   CLASS  WEIGHT     REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
> -1         182.59897         -  191 TiB  149 TiB  149 TiB    35 GiB  503 GiB   42 TiB  77.96  1.00    -          root default
> -2          24.62473         -   29 TiB   22 TiB   22 TiB   5.0 GiB  80 GiB  7.1 TiB  75.23  0.96    -              host ceph01
>  0    hdd    2.39999   1.00000  2.7 TiB  2.2 TiB  2.2 TiB   665 MiB  8.0 GiB  480 GiB  82.43  1.06   53      up          osd.0
>  1    hdd    2.29999   1.00000  2.7 TiB  2.1 TiB  2.1 TiB   446 MiB  7.5 GiB  590 GiB  78.44  1.01   49      up          osd.1
>  4    hdd    2.67029   0.91066  2.7 TiB  2.2 TiB  2.2 TiB   484 MiB  7.9 GiB  440 GiB  83.90  1.08   53      up          osd.4
>  8    hdd    2.39999   1.00000  2.7 TiB  2.1 TiB  2.1 TiB   490 MiB  7.9 GiB  533 GiB  80.49  1.03   51      up          osd.8
> 11    hdd    1.71660   1.00000  1.7 TiB  1.5 TiB  1.5 TiB   406 MiB  5.5 GiB  200 GiB  88.60  1.14   36      up          osd.11
> 12    hdd    1.29999   1.00000  2.7 TiB  1.2 TiB  1.2 TiB   366 MiB  4.9 GiB  1.5 TiB  43.89  0.56   28      up          osd.12
> 14    hdd    2.20000   1.00000  2.7 TiB  2.0 TiB  2.0 TiB   418 MiB  7.1 GiB  693 GiB  74.66  0.96   47      up          osd.14
> 18    hdd    2.20000   1.00000  2.7 TiB  2.0 TiB  1.9 TiB   434 MiB  7.3 GiB  737 GiB  73.05  0.94   47      up          osd.18
> 22    hdd    1.00000   1.00000  1.7 TiB  890 GiB  886 GiB   110 MiB  3.6 GiB  868 GiB  50.62  0.65   20      up          osd.22
> 30    hdd    1.50000   1.00000  1.7 TiB  1.4 TiB  1.3 TiB   361 MiB  4.9 GiB  370 GiB  78.93  1.01   32      up          osd.30
> 33    hdd    1.59999   0.97437  1.6 TiB  1.4 TiB  1.4 TiB   397 MiB  5.4 GiB  213 GiB  87.20  1.12   34      up          osd.33
> 64    hdd    3.33789   0.89752  3.3 TiB  2.7 TiB  2.7 TiB   573 MiB  9.9 GiB  647 GiB  81.07  1.04   64      up          osd.64
> -3          26.79504         -   30 TiB   24 TiB   24 TiB   6.2 GiB  89 GiB  5.4 TiB  81.80  1.05    -              host ceph02
>  2    hdd    1.50000   1.00000  1.7 TiB  1.4 TiB  1.4 TiB   363 MiB  5.3 GiB  359 GiB  79.58  1.02   32      up          osd.2
>  3    hdd    2.50000   1.00000  2.7 TiB  2.2 TiB  2.2 TiB   647 MiB  7.8 GiB  469 GiB  82.85  1.06   53      up          osd.3
>  7    hdd    2.00000   1.00000  2.7 TiB  1.8 TiB  1.8 TiB   453 MiB  7.0 GiB  848 GiB  69.00  0.89   43      up          osd.7
>  9    hdd    2.67029   0.98323  2.7 TiB  2.4 TiB  2.3 TiB   709 MiB  8.8 GiB  322 GiB  88.21  1.13   57      up          osd.9
> 13    hdd    1.79999   1.00000  2.4 TiB  1.7 TiB  1.6 TiB   410 MiB  6.5 GiB  747 GiB  69.41  0.89   40      up          osd.13
> 16    hdd    2.50000   1.00000  2.7 TiB  2.2 TiB  2.2 TiB   637 MiB  7.8 GiB  458 GiB  83.26  1.07   53      up          osd.16
> 19    hdd    1.39999   1.00000  1.7 TiB  1.3 TiB  1.3 TiB   345 MiB  5.1 GiB  465 GiB  73.53  0.94   30      up          osd.19
> 23    hdd    2.00000   1.00000  2.7 TiB  1.9 TiB  1.9 TiB   442 MiB  7.7 GiB  738 GiB  73.02  0.94   43      up          osd.23
> 24    hdd    1.71660   0.95634  1.7 TiB  1.5 TiB  1.5 TiB   426 MiB  5.8 GiB  187 GiB  89.37  1.15   36      up          osd.24
> 28    hdd    2.70000   1.00000  2.7 TiB  2.5 TiB  2.4 TiB   712 MiB  8.4 GiB  219 GiB  92.00  1.18   58      up          osd.28
> 31    hdd    2.67029   0.92993  2.7 TiB  2.3 TiB  2.3 TiB   465 MiB  8.1 GiB  393 GiB  85.62  1.10   54      up          osd.31
> 32    hdd    3.33789   1.00000  3.3 TiB  3.0 TiB  3.0 TiB   693 MiB  11 GiB  306 GiB  91.06  1.17   71      up          osd.32
> -4          24.52005         -   26 TiB   21 TiB   21 TiB   5.0 GiB  79 GiB  5.1 TiB  80.51  1.03    -              host ceph03
>  5    hdd    1.71660   1.00000  1.7 TiB  1.5 TiB  1.5 TiB   392 MiB  5.6 GiB  223 GiB  87.34  1.12   35      up          osd.5
>  6    hdd    1.71660   1.00000  1.7 TiB  1.5 TiB  1.5 TiB   397 MiB  5.6 GiB  221 GiB  87.41  1.12   35      up          osd.6
> 10    hdd    2.50000   0.97487  2.7 TiB  2.2 TiB  2.2 TiB   497 MiB  7.7 GiB  480 GiB  82.46  1.06   52      up          osd.10
> 15    hdd    2.29999   1.00000  2.7 TiB  2.1 TiB  2.1 TiB   474 MiB  7.6 GiB  586 GiB  78.57  1.01   49      up          osd.15
> 17    hdd    1.39999   1.00000  1.6 TiB  1.2 TiB  1.2 TiB   352 MiB  5.6 GiB  384 GiB  76.88  0.99   30      up          osd.17
> 20    hdd    1.59999   1.00000  1.7 TiB  1.4 TiB  1.4 TiB   234 MiB  5.4 GiB  331 GiB  81.15  1.04   33      up          osd.20
> 21    hdd    2.00000   1.00000  2.7 TiB  1.8 TiB  1.8 TiB   611 MiB  7.0 GiB  868 GiB  68.27  0.88   44      up          osd.21
> 25    hdd    1.70000   0.92348  1.7 TiB  1.4 TiB  1.4 TiB   407 MiB  5.6 GiB  274 GiB  84.41  1.08   35      up          osd.25
> 26    hdd    2.50000   1.00000  2.7 TiB  2.2 TiB  2.2 TiB   464 MiB  7.8 GiB  441 GiB  83.88  1.08   52      up          osd.26
> 27    hdd    2.70000   0.95955  2.7 TiB  2.4 TiB  2.4 TiB   674 MiB  8.3 GiB  318 GiB  88.35  1.13   57      up          osd.27
> 29    hdd    2.67029   0.73337  2.7 TiB  1.8 TiB  1.8 TiB   436 MiB  6.7 GiB  885 GiB  67.63  0.87   43      up          osd.29
> 63    hdd    1.71660   1.00000  1.7 TiB  1.5 TiB  1.5 TiB   226 MiB  5.7 GiB  224 GiB  87.26  1.12   35      up          osd.63
> -11          24.64297         -   25 TiB   21 TiB   21 TiB   4.9 GiB  66 GiB  3.4 TiB  86.48  1.11    -              host ceph04
> 34    hdd    5.24519   0.85004  5.2 TiB  4.0 TiB  4.0 TiB  1002 MiB  13 GiB  1.2 TiB  76.37  0.98   97      up          osd.34
> 42    hdd    5.24519   1.00000  5.2 TiB  4.7 TiB  4.7 TiB   1.1 GiB  15 GiB  545 GiB  89.86  1.15  113      up          osd.42
> 44    hdd    7.00000   1.00000  7.2 TiB  6.3 TiB  6.3 TiB   1.4 GiB  19 GiB  901 GiB  87.70  1.12  150      up          osd.44
> 45    hdd    7.15259   1.00000  7.2 TiB  6.5 TiB  6.4 TiB   1.5 GiB  19 GiB  718 GiB  90.20  1.16  154      up          osd.45
> -13          30.04085         -   30 TiB   26 TiB   26 TiB   5.8 GiB  81 GiB  4.2 TiB  86.11  1.10    -              host ceph05
> 39    hdd    7.15259   1.00000  7.2 TiB  6.4 TiB  6.4 TiB   1.5 GiB  19 GiB  751 GiB  89.74  1.15  153      up          osd.39
> 40    hdd    7.15259   1.00000  7.2 TiB  6.4 TiB  6.4 TiB   1.3 GiB  19 GiB  767 GiB  89.53  1.15  153      up          osd.40
> 41    hdd    7.15259   0.90002  7.2 TiB  5.8 TiB  5.7 TiB   1.2 GiB  18 GiB  1.4 TiB  80.54  1.03  138      up          osd.41
> 43    hdd    5.24519   1.00000  5.2 TiB  4.7 TiB  4.7 TiB   1.1 GiB  15 GiB  574 GiB  89.32  1.15  113      up          osd.43
> 60    hdd    3.33789   0.85780  3.3 TiB  2.6 TiB  2.6 TiB   685 MiB  8.9 GiB  754 GiB  77.93  1.00   62      up          osd.60
> -9          17.64297         -   18 TiB   13 TiB   13 TiB   3.0 GiB  43 GiB  4.4 TiB  74.85  0.96    -              host ceph06
> 35    hdd    7.15259   0.80005  7.2 TiB  5.2 TiB  5.2 TiB   1.0 GiB  16 GiB  2.0 TiB  72.31  0.93  124      up          osd.35
> 36    hdd    5.24519   0.85004  5.2 TiB  4.0 TiB  4.0 TiB   985 MiB  13 GiB  1.2 TiB  76.65  0.98   97      up          osd.36
> 38    hdd    5.24519   0.85004  5.2 TiB  4.0 TiB  4.0 TiB   1.0 GiB  13 GiB  1.2 TiB  76.50  0.98   97      up          osd.38
> -15          24.79565         -   25 TiB   22 TiB   22 TiB   4.7 GiB  66 GiB  3.1 TiB  87.64  1.12    -              host ceph07
> 66    hdd    7.15259   1.00000  7.2 TiB  6.5 TiB  6.5 TiB   1.5 GiB  19 GiB  670 GiB  90.86  1.17  155      up          osd.66
> 67    hdd    7.15259   0.91141  7.2 TiB  5.8 TiB  5.8 TiB   1.1 GiB  18 GiB  1.3 TiB  81.62  1.05  140      up          osd.67
> 68    hdd    3.33789   1.00000  3.3 TiB  3.0 TiB  3.0 TiB   738 MiB  9.8 GiB  299 GiB  91.24  1.17   71      up          osd.68
> 69    hdd    7.15259   1.00000  7.2 TiB  6.3 TiB  6.3 TiB   1.3 GiB  19 GiB  823 GiB  88.77  1.14  152      up          osd.69
> -17           9.53670         -  9.5 TiB  1.4 GiB   85 MiB   473 MiB  832 MiB  9.5 TiB   0.01     0    -              host ceph08
> 37    hdd    9.53670   1.00000  9.5 TiB  1.4 GiB   85 MiB   473 MiB  832 MiB  9.5 TiB   0.01     0    2      up          osd.37
>                          TOTAL  191 TiB  149 TiB  149 TiB    35 GiB  503 GiB   42 TiB  77.96
> MIN/MAX VAR: 0/1.18  STDDEV: 14.73
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



