Re: Need help! Ceph backfill_toofull and recovery_wait+degraded

If you have the default crushmap and osd pool default size = 3, then Ceph creates 3 copies of each object and stores them on 3 separate nodes.
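
To see why the three single-OSD hosts carry so much data, here is a toy model (not real CRUSH, just weight-proportional draws of 3 distinct hosts, using the host weights from the osd tree below): with only 4 hosts and size = 3, every placement group must land on 3 of the 4 hosts, so each small host ends up holding a replica for roughly two thirds of all placements no matter how heavily ceph4 is weighted.

```python
import random

random.seed(0)

# Host weights taken from the "ceph osd tree" output (host totals).
hosts = {"ceph1": 3.64, "ceph2": 3.50, "ceph3": 3.64, "ceph4": 15.9}

def place_pg():
    """Pick 3 distinct hosts, weight-proportional (a toy stand-in for CRUSH)."""
    remaining = dict(hosts)
    chosen = []
    for _ in range(3):
        names = list(remaining)
        weights = [remaining[n] for n in names]
        pick = random.choices(names, weights=weights)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen

trials = 20000
counts = {h: 0 for h in hosts}
for _ in range(trials):
    for h in place_pg():
        counts[h] += 1

for h in sorted(hosts):
    print(f"{h}: holds a replica in {counts[h] / trials:.0%} of placements")
```

Running this, ceph4 is chosen almost always, while each small host still appears in roughly two thirds of placements: the three replicas collectively cover three of the four hosts, so the small hosts fill up in lockstep however the weights are tuned.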

So the best way to solve your space problem is to even out the capacity between your hosts, either by adding disks to ceph1, ceph2 and ceph3, or by adding more nodes.
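
A quick back-of-envelope check shows why reweighting alone cannot empty the small hosts (numbers are approximations taken from the "ceph df" and "ceph osd df" output quoted below):

```python
# With size=3 and one replica per host, even in the best case
# (ceph4 holds a replica of everything) the remaining two
# replicas are split across the three small hosts, so each
# small host must store at least 2/3 of one logical data copy.

data_tb = 4.5                  # ~4501 GB of pool data (one logical copy)
small_host_capacity_tb = 3.7   # each of ceph1/2/3 has a single ~3.7 TB OSD

min_share_per_small_host = data_tb * 2 / 3
utilisation = min_share_per_small_host / small_host_capacity_tb

print(f"each small host must hold >= {min_share_per_small_host:.1f} TB "
      f"(~{utilisation:.0%} of its capacity)")
# prints: each small host must hold >= 3.0 TB (~81% of its capacity)
```

That lower bound matches the ~85% utilisation seen on osd.0/1/2: the small hosts are pinned near full by the replication rule itself, so only more capacity on those hosts (or more hosts) helps.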


kind regards
Ronny Aasen




On 01.11.2016 20:14, Marcus Müller wrote:
> Hi all,
>
> I have a big problem and I really hope someone can help me!
>
> We have been running a Ceph cluster for a year now. The version is 0.94.7 (Hammer).
> Here is some info:
>
> Our osd map is:
>
> ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 26.67998 root default
> -2  3.64000     host ceph1
>  0  3.64000         osd.0       up  1.00000          1.00000
> -3  3.50000     host ceph2
>  1  3.50000         osd.1       up  1.00000          1.00000
> -4  3.64000     host ceph3
>  2  3.64000         osd.2       up  1.00000          1.00000
> -5 15.89998     host ceph4
>  3  4.00000         osd.3       up  1.00000          1.00000
>  4  3.59999         osd.4       up  1.00000          1.00000
>  5  3.29999         osd.5       up  1.00000          1.00000
>  6  5.00000         osd.6       up  1.00000          1.00000
>
> ceph df:
>
> GLOBAL:
>     SIZE       AVAIL      RAW USED     %RAW USED
>     40972G     26821G       14151G         34.54
> POOLS:
>     NAME                ID     USED      %USED     MAX AVAIL     OBJECTS
>     blocks              7      4490G     10.96         1237G     7037004
>     commits             8       473M         0         1237G      802353
>     fs                  9      9666M      0.02         1237G     7863422
>
> ceph osd df:
>
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR
>  0 3.64000  1.00000  3724G  3128G   595G 84.01 2.43
>  1 3.50000  1.00000  3724G  3237G   487G 86.92 2.52
>  2 3.64000  1.00000  3724G  3180G   543G 85.41 2.47
>  3 4.00000  1.00000  7450G  1616G  5833G 21.70 0.63
>  4 3.59999  1.00000  7450G  1246G  6203G 16.74 0.48
>  5 3.29999  1.00000  7450G  1181G  6268G 15.86 0.46
>  6 5.00000  1.00000  7450G   560G  6889G  7.52 0.22
>               TOTAL 40972G 14151G 26820G 34.54
> MIN/MAX VAR: 0.22/2.52  STDDEV: 36.53
>
>
> Our current cluster state is:
>
>      health HEALTH_WARN
>             63 pgs backfill
>             8 pgs backfill_toofull
>             9 pgs backfilling
>             11 pgs degraded
>             1 pgs recovering
>             10 pgs recovery_wait
>             11 pgs stuck degraded
>             89 pgs stuck unclean
>             recovery 8237/52179437 objects degraded (0.016%)
>             recovery 9620295/52179437 objects misplaced (18.437%)
>             2 near full osd(s)
>             noout,noscrub,nodeep-scrub flag(s) set
> monmap e8: 4 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0}
>             election epoch 400, quorum 0,1,2,3 ceph1,ceph2,ceph3,ceph4
>      osdmap e1774: 7 osds: 7 up, 7 in; 84 remapped pgs
>             flags noout,noscrub,nodeep-scrub
>       pgmap v7316159: 320 pgs, 3 pools, 4501 GB data, 15336 kobjects
>             14152 GB used, 26820 GB / 40972 GB avail
>             8237/52179437 objects degraded (0.016%)
>             9620295/52179437 objects misplaced (18.437%)
>                  231 active+clean
>                   61 active+remapped+wait_backfill
>                    9 active+remapped+backfilling
>                    6 active+recovery_wait+degraded+remapped
>                    6 active+remapped+backfill_toofull
>                    4 active+recovery_wait+degraded
>                    2 active+remapped+wait_backfill+backfill_toofull
>                    1 active+recovering+degraded
> recovery io 11754 kB/s, 35 objects/s
>   client io 1748 kB/s rd, 249 kB/s wr, 44 op/s
>
>
> My main problems are:
>
> - As you can see from the osd tree, we have three hosts with only one OSD each and a fourth host with four OSDs. Ceph will not move the data off the three single-OSD nodes, which are all near full. I tried to increase the weight of the OSDs in the bigger node, but that simply did not work, so yesterday I added a new OSD, which has not made things better, as you can see now. What do I have to do to empty these three nodes again and put more data on the node with the four HDDs?
>
> - I added the "ceph4" node later, which resulted in a strange IP change, as you can see in the mon list: the public network and the cluster network were swapped or not assigned correctly. See ceph.conf:
>
> [global]
> fsid = xxx
> mon_initial_members = ceph1
> mon_host = 192.168.10.3, 192.168.10.4, 192.168.10.5, 192.168.10.11
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
> public_network = 192.168.60.0/24
> cluster_network = 192.168.10.0/24
> osd pool default size = 3
> osd pool default min size = 1
> osd pool default pg num = 128
> osd pool default pgp num = 128
> osd recovery max active = 50
> osd recovery threads = 3
> mon_pg_warn_max_per_osd = 0
>
> What can I do in this case (it is not a big problem, since the network is 2x 10 GbE and everything works)?
>
> - One other thing: even if I only prepare an OSD, it is automatically added to the cluster, and I cannot activate it myself. Has anyone else seen this behavior?
>
> I am now deleting some data in the cluster, which has already helped a bit:
>
>      health HEALTH_WARN
>             63 pgs backfill
>             8 pgs backfill_toofull
>             10 pgs backfilling
>             7 pgs degraded
>             3 pgs recovery_wait
>             7 pgs stuck degraded
>             82 pgs stuck unclean
>             recovery 6498/52085528 objects degraded (0.012%)
>             recovery 9507140/52085528 objects misplaced (18.253%)
>             2 near full osd(s)
>             noout,noscrub,nodeep-scrub flag(s) set
> monmap e8: 4 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0}
>             election epoch 400, quorum 0,1,2,3 ceph1,ceph2,ceph3,ceph4
>      osdmap e1780: 7 osds: 7 up, 7 in; 81 remapped pgs
>             flags noout,noscrub,nodeep-scrub
>       pgmap v7317114: 320 pgs, 3 pools, 4499 GB data, 15333 kobjects
>             14100 GB used, 26872 GB / 40972 GB avail
>             6498/52085528 objects degraded (0.012%)
>             9507140/52085528 objects misplaced (18.253%)
>                  238 active+clean
>                   60 active+remapped+wait_backfill
>                    7 active+remapped+backfilling
>                    6 active+remapped+backfill_toofull
>                    3 active+degraded+remapped+backfilling
>                    2 active+remapped+wait_backfill+backfill_toofull
>                    2 active+recovery_wait+degraded+remapped
>                    1 active+degraded+remapped+wait_backfill
>                    1 active+recovery_wait+degraded
> recovery io 7844 kB/s, 27 objects/s
>   client io 343 kB/s rd, 1 op/s
>
>
> If you need more information, just say so. I really need help!
>
> Thank you for reading this far!
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


