There are several more fixes queued up for v12.2.12:

16b7cc1bf9 osd/OSDMap: add log for better debugging
3d2945dd6e osd/OSDMap: calc_pg_upmaps - restrict optimization to origin pools only
ab2dbc2089 osd/OSDMap: drop local pool filter in calc_pg_upmaps
119d8cb2a1 crush: fix upmap overkill
0729a78877 osd/OSDMap: using std::vector::reserve to reduce memory reallocation
f4f66e4f0a osd/OSDMap: more improvements to upmap
7bebc4cd28 osd/OSDMap: be more aggressive when trying to balance
1763a879e3 osd/OSDMap: potential access violation fix
8b3114ea62 osd/OSDMap: don't mapping all pgs each time in calc_pg_upmaps

I haven't personally tried the newest of those yet because the
balancer is working pretty well in our environment.
One thing we definitely need to improve, though, is the osd failure /
upmap interplay. We currently lose all related upmaps when an osd is
marked out. This means that even if we re-use the osd-id, we still
need the balancer to run for a while to restore the perfect balancing.

If you have simple reproducers for your issues, please do create a tracker.

-- Dan

On Thu, Apr 4, 2019 at 1:21 PM Kári Bertilsson <karibertils@xxxxxxxxx> wrote:
>
> Yeah I agree... the auto balancer is definitely doing a poor job for me.
>
> I have been experimenting with this for weeks and I can make way better optimization than the balancer by looking at "ceph osd df tree" and manually running various ceph upmap commands.
>
> Too bad this is tedious work, and tends to get imbalanced again as soon as I need to replace disks.
>
> On Thu, Apr 4, 2019 at 10:49 AM Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
>>
>> On Mon, 18 Mar 2019 at 16:42, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> >
>> > The balancer optimizes # PGs / crush weight. That host looks already
>> > quite balanced for that metric.
>> >
>> > If the balancing is not optimal for a specific pool that has most of
>> > the data, then you can use the `optimize myplan <pool>` param.
>> >
>>
>> From experimenting, in three different clusters, this is not quite right.
>>
>> I've found that the balancer is quite unable to optimize correctly if
>> you have mixed sized OSDs, even if there's only one OSD that's bigger
>> by 12 GBs.
>>
>> --
>> Iain Buclaw
>>
>> *(p < e ? p++ : p) = (c & 0x0f) + '0';
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
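
For readers following the thread, the metric mentioned above (# PGs / crush weight) can be illustrated with a small Python sketch. This is not the balancer's code; the function names and the PG counts and weights are made up for illustration, and real values would have to be read off something like "ceph osd df tree". It only shows how a bigger OSD changes the per-OSD PG target.

```python
# Illustrative sketch only: hypothetical helpers, not part of Ceph.
# Each OSD is described as (pg_count, crush_weight).

def pgs_per_weight(osds):
    """Return {osd_id: pg_count / crush_weight}, skipping zero-weight OSDs."""
    return {osd_id: pgs / weight
            for osd_id, (pgs, weight) in osds.items()
            if weight > 0}

def target_pgs(osds):
    """PG count each OSD would hold if PGs were spread in proportion to weight."""
    total_pgs = sum(pgs for pgs, _ in osds.values())
    total_weight = sum(weight for _, weight in osds.values())
    return {osd_id: total_pgs * weight / total_weight
            for osd_id, (_, weight) in osds.items()}

def deviations(osds):
    """Actual PG count minus the weight-proportional target, per OSD."""
    targets = target_pgs(osds)
    return {osd_id: osds[osd_id][0] - targets[osd_id] for osd_id in osds}

# Made-up example: two equal OSDs and one with twice the crush weight,
# all currently holding the same number of PGs.
example = {0: (100, 1.0), 1: (100, 1.0), 2: (100, 2.0)}

if __name__ == "__main__":
    print(pgs_per_weight(example))
    print(deviations(example))
```

In this made-up case the larger OSD holds fewer PGs than its weight-proportional share, while the two smaller ones hold more. Equal PG counts are therefore not balanced under this metric once OSD sizes differ.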