I had a similar problem when I upgraded to Octopus, and the solution was
to turn off autobalancing. You can try turning it off, if it is enabled:

# ceph balancer off
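Roughly, and assuming a release with the mgr balancer module (Luminous
or later), it is worth checking what the balancer is doing before you
disable it:

# ceph balancer status       (shows the mode and whether it is active)
# ceph balancer mode none    (keep the module loaded but stop optimizing)

Either "mode none" (what Eugen describes below) or "balancer off" should
stop new remappings from being scheduled; the backfills already queued
will still have to drain.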
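To confirm it really is the balancer moving previously clean PGs, and to
verify the throttles actually in effect, something like the following
should work on Nautilus/Octopus (a sketch; check the syntax on your
release):

# ceph pg ls remapped                         (which PGs are being moved right now)
# ceph osd dump | grep pg_upmap               (upmap entries are typically balancer-placed)
# ceph config get osd osd_max_backfills       (confirm the runtime value, 50 in George's case)
# ceph config get osd osd_recovery_sleep_hdd

If new pg_upmap_items entries keep appearing while backfill is still
draining, the balancer is re-planning faster than the cluster can catch
up, which would match the never-finishing behavior George describes.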
On Fri, Apr 24, 2020 at 8:51 AM Eugen Block <eblock@xxxxxx> wrote:
> Hi,
>
> the balancer is probably running, which mode? I changed the mode to
> none in our own cluster because it also never finished rebalancing and
> we didn't have a bad PG distribution. Maybe it's supposed to be like
> that, I don't know.
>
> Regards,
> Eugen
>
>
> Quoting "Kyriazis, George" <george.kyriazis@xxxxxxxxx>:
>
> > Hello,
> >
> > I have a Proxmox Ceph cluster with 5 nodes and 3 OSDs each (15 OSDs
> > total), on a 10G network.
> >
> > The cluster started small, and I have progressively added OSDs over
> > time. The problem is that the cluster never rebalances completely.
> > There is always progress on backfilling, but PGs that used to be in
> > the active+clean state jump back into active+remapped+backfilling
> > (or active+remapped+backfill_wait), to be moved to different OSDs.
> >
> > Initially I had a 1G network (recently upgraded to 10G), and I was
> > holding back on the backfill settings (osd_max_backfills and
> > osd_recovery_sleep_hdd). In the last few weeks I upgraded to 10G,
> > with osd_max_backfills = 50 and osd_recovery_sleep_hdd = 0 (only
> > HDDs, no SSDs). The cluster has been backfilling for months now
> > with no end in sight.
> >
> > Is this normal behavior? Is there any setting I can look at that
> > will give me an idea as to why PGs are jumping back into remapped
> > from clean?
> >
> > Below is the output of "ceph osd tree" and "ceph osd df":
> >
> > # ceph osd tree
> > ID   CLASS  WEIGHT     TYPE NAME            STATUS  REWEIGHT  PRI-AFF
> >  -1         203.72472  root default
> >  -9          40.01666      host vis-hsw-01
> >   3    hdd   10.91309          osd.3            up   1.00000  1.00000
> >   6    hdd   14.55179          osd.6            up   1.00000  1.00000
> >  10    hdd   14.55179          osd.10           up   1.00000  1.00000
> > -13          40.01666      host vis-hsw-02
> >   0    hdd   10.91309          osd.0            up   1.00000  1.00000
> >   7    hdd   14.55179          osd.7            up   1.00000  1.00000
> >  11    hdd   14.55179          osd.11           up   1.00000  1.00000
> > -11          40.01666      host vis-hsw-03
> >   4    hdd   10.91309          osd.4            up   1.00000  1.00000
> >   8    hdd   14.55179          osd.8            up   1.00000  1.00000
> >  12    hdd   14.55179          osd.12           up   1.00000  1.00000
> >  -3          40.01666      host vis-hsw-04
> >   5    hdd   10.91309          osd.5            up   1.00000  1.00000
> >   9    hdd   14.55179          osd.9            up   1.00000  1.00000
> >  13    hdd   14.55179          osd.13           up   1.00000  1.00000
> > -15          43.65807      host vis-hsw-05
> >   1    hdd   14.55269          osd.1            up   1.00000  1.00000
> >   2    hdd   14.55269          osd.2            up   1.00000  1.00000
> >  14    hdd   14.55269          osd.14           up   1.00000  1.00000
> >  -5                 0      host vis-ivb-07
> >  -7                 0      host vis-ivb-10
> > #
> >
> > # ceph osd df
> > ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
> >  3    hdd  10.91309   1.00000   11 TiB  8.2 TiB  8.2 TiB  552 MiB   25 GiB  2.7 TiB  75.08  1.19  131      up
> >  6    hdd  14.55179   1.00000   15 TiB  9.1 TiB  9.1 TiB  1.2 GiB   30 GiB  5.5 TiB  62.47  0.99  148      up
> > 10    hdd  14.55179   1.00000   15 TiB  8.1 TiB  8.1 TiB  1.5 GiB   20 GiB  6.4 TiB  55.98  0.89  142      up
> >  0    hdd  10.91309   1.00000   11 TiB  7.5 TiB  7.4 TiB  504 MiB   24 GiB  3.5 TiB  68.34  1.09  120      up
> >  7    hdd  14.55179   1.00000   15 TiB  8.7 TiB  8.7 TiB  1.0 GiB   31 GiB  5.8 TiB  60.07  0.95  144      up
> > 11    hdd  14.55179   1.00000   15 TiB  9.4 TiB  9.3 TiB  819 MiB   20 GiB  5.2 TiB  64.31  1.02  147      up
> >  4    hdd  10.91309   1.00000   11 TiB  7.0 TiB  7.0 TiB  284 MiB   25 GiB  3.9 TiB  64.35  1.02  112      up
> >  8    hdd  14.55179   1.00000   15 TiB  9.3 TiB  9.2 TiB  1.8 GiB   29 GiB  5.3 TiB  63.65  1.01  157      up
> > 12    hdd  14.55179   1.00000   15 TiB  8.6 TiB  8.6 TiB  623 MiB   19 GiB  5.9 TiB  59.14  0.94  136      up
> >  5    hdd  10.91309   1.00000   11 TiB  8.6 TiB  8.6 TiB  542 MiB   29 GiB  2.3 TiB  79.01  1.26  134      up
> >  9    hdd  14.55179   1.00000   15 TiB  8.2 TiB  8.2 TiB  707 MiB   27 GiB  6.3 TiB  56.56  0.90  138      up
> > 13    hdd  14.55179   1.00000   15 TiB  8.7 TiB  8.7 TiB  741 MiB   18 GiB  5.8 TiB  59.85  0.95  134      up
> >  1    hdd  14.55269   1.00000   15 TiB  9.8 TiB  9.8 TiB  1.3 GiB   20 GiB  4.8 TiB  67.18  1.07  158      up
> >  2    hdd  14.55269   1.00000   15 TiB  8.7 TiB  8.7 TiB  936 MiB   18 GiB  5.8 TiB  60.04  0.95  148      up
> > 14    hdd  14.55269   1.00000   15 TiB  8.3 TiB  8.3 TiB  673 MiB   18 GiB  6.3 TiB  56.97  0.90  131      up
> >                       TOTAL    204 TiB  128 TiB  128 TiB   13 GiB  350 GiB   75 TiB  62.95
> > MIN/MAX VAR: 0.89/1.26  STDDEV: 6.44
> > #
> >
> >
> > Thank you!
> >
> > George
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx