Re: jj's "improved" ceph balancer



Hi Josh,

That's another interesting dimension...
Indeed, a cluster that has plenty of free capacity could be balanced
by workload/iops, but once it reaches maybe 60 or 70% full, then I think
capacity would need to take priority.

But to be honest I don't really understand the workload/iops balancing
use-case. Can you describe some of the scenarios you have in mind?

.. Dan

On Wed, 20 Oct 2021, 20:45 Josh Salomon, <jsalomon@xxxxxxxxxx> wrote:

> Just another point of view:
> The current balancer balances the capacity but this is not enough. The
> balancer should also balance the workload and we plan on adding primary
> balancing for Quincy. In order to balance the workload you should work pool
> by pool because pools have different workloads. So while the observation
> about the +1 PGs is correct, I believe the correct solution should be
> taking this into consideration while still balancing capacity pool by pool.
> Capacity balancing is a functional requirement, while workload balancing
> is a performance requirement so it is important only for very loaded
> systems (loaded in terms of high IOPS, not nearly full systems).
> I would appreciate comments on this thought.
> On Wed, 20 Oct 2021, 20:57 Dan van der Ster, <dan@xxxxxxxxxxxxxx> wrote:
>> Hi Jonas,
>> From your readme:
>> "the best possible solution is some OSDs having an offset of 1 PG to the
>> ideal count. As a PG-distribution-optimization is done per pool, without
>> checking other pool's distribution at all, some devices will be the +1 more
>> often than others. At worst one OSD is the +1 for each pool in the cluster."
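
To make the quoted "+1" effect concrete, here is a minimal toy sketch. It is not the `mgr balancer` and not Jonas's implementation: it ignores CRUSH rules, replica placement constraints, and PG sizes, and simply assumes each pool's PG replicas can be assigned freely to a flat list of OSDs. The function names and the example numbers are made up for illustration.

```python
def per_pool_totals(pools, osd_ids):
    """Balance each pool in isolation: the remainder ("+1") PGs always go
    to the first OSDs in the list, regardless of what other pools did."""
    totals = dict.fromkeys(osd_ids, 0)
    for pg_count in pools:
        base, extra = divmod(pg_count, len(osd_ids))
        for i, osd in enumerate(osd_ids):
            totals[osd] += base + (1 if i < extra else 0)
    return totals


def cross_pool_totals(pools, osd_ids):
    """Same per-pool split, but the "+1" PGs go to whichever OSDs currently
    hold the fewest PGs across all pools placed so far."""
    totals = dict.fromkeys(osd_ids, 0)
    for pg_count in pools:
        base, extra = divmod(pg_count, len(osd_ids))
        # Pick the least-loaded OSDs (by total PG count) for the extras.
        least_loaded = sorted(osd_ids, key=lambda o: totals[o])[:extra]
        for osd in osd_ids:
            totals[osd] += base
        for osd in least_loaded:
            totals[osd] += 1
    return totals


if __name__ == "__main__":
    pools = [33, 33, 33, 33]   # hypothetical: 4 pools, 33 PG replicas each
    osds = [0, 1, 2, 3]        # hypothetical: 4 OSDs
    print("per pool:  ", per_pool_totals(pools, osds))
    print("cross pool:", cross_pool_totals(pools, osds))
```

In this toy case every pool is individually "perfect" (off by at most 1 PG per OSD), yet with purely per-pool balancing OSD 0 ends up with 36 PGs against 32 on the others, while tracking totals across pools yields 33 on every OSD.
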
>> That's an interesting observation/flaw which hadn't occurred to me
>> before. I think we don't ever see it in practice in our clusters because we
>> do not have multiple large pools on the same osds.
>> How large are the variances in your real clusters? I hope the example in
>> your readme isn't from real life??
>> Cheers, Dan
>> On Wed, 20 Oct 2021, 15:11 Jonas Jelten, <jelten@xxxxxxxxx> wrote:
>>> Hi!
>>> I've been working on this for quite some time now and I think it's ready
>>> for some broader testing and feedback.
>>> It's an alternative standalone balancer implementation, optimizing for
>>> equal OSD storage utilization and PG placement across all pools.
>>> It doesn't change your cluster in any way, it just prints the commands
>>> you can run to apply the PG movements.
>>> Please play around with it :)
>>> Quickstart example: generate 10 PG movements on hdd to stdout
>>>     ./ -v balance --max-pg-moves 10 --only-crushclass hdd | tee /tmp/balance-upmaps
>>> When there are remapped PGs (e.g. after applying the above upmaps), you can
>>> inspect progress with:
>>>     ./ showremapped
>>>     ./ showremapped --by-osd
>>> And you can get a nice Pool and OSD usage overview:
>>>     ./ show --osds --per-pool-count --sort-utilization
>>> Of course there are many more features and optimizations to be added,
>>> but it has already served us very well in reclaiming terabytes of
>>> previously unavailable storage where the `mgr balancer` could no longer
>>> optimize.
>>> What do you think?
>>> Cheers
>>>   -- Jonas
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>> _______________________________________________
>> Dev mailing list -- dev@xxxxxxx
>> To unsubscribe send an email to dev-leave@xxxxxxx
