Hi,

Indeed, we reweight OSDs to balance them, just not very effectively on this particular cluster.

But I'm curious whether reweighting alone can fix this: if all of a host's OSDs are reweighted by, say, 0.5, does that result in other hosts being selected? Or do we need to change the CRUSH weights themselves to fix this kind of imbalance?

-- Dan

> On 22 Nov 2016, at 13:44, Bartłomiej Święcki <bartlomiej.swiecki@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> We've observed very similar problems in our clusters; it takes a lot of careful reweighting to keep the OSDs at more or less the same usage level.
> Because of that issue, we're currently trying to keep racks as regular as possible. I hope the patch you mentioned will address this too.
>
> Regards,
> Bartek
>
>
> On 11/22/2016 01:33 PM, Dan Van Der Ster wrote:
>> Hi,
>>
>> I have a couple of questions about http://tracker.ceph.com/issues/15653
>>
>> In the ticket, Sage discusses small/big drives, where the small drives get more data than expected.
>>
>> But we observe this at the rack level: our cluster has four racks, with 7, 8, 8, and 4 hosts respectively. The rack with 4 hosts is ~35% more full than the others.
>>
>> So AFAICT, because of #15653, CRUSH does not currently work well if you try to build a pool which is replicated rack/host-wise when your racks/hosts are not all ~identical in size.
>>
>> Are others noticing this pattern?
>> Or are we unusual in that our clusters are not flat/uniform in structure?
>>
>> Cheers, Dan
>> _______________________________________________
>> Ceph-large mailing list
>> Ceph-large@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-large-ceph.com
>
_______________________________________________
Ceph-large mailing list
Ceph-large@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-large-ceph.com
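
[Editor's sketch, for readers following the thread: the two knobs being contrasted above, shown with a hypothetical osd.12 and illustrative weights. This only describes what each command changes; whether the override reweight alone shifts data to other hosts is exactly the open question in the thread.]

    # Override reweight: a 0-1 multiplier applied to this OSD after CRUSH has
    # picked it; it does not change the weight of the parent host/rack buckets.
    ceph osd reweight 12 0.5

    # CRUSH reweight: changes the item's weight in the CRUSH map itself, which
    # is summed into the weights of its parent host and rack buckets.
    ceph osd crush reweight osd.12 0.5

    # Inspect per-OSD and per-bucket weights, reweights, and utilization.
    ceph osd df tree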