Primary affinity can help with a single pool. With multiple pools that have different r/w ratios it becomes messy, since primary affinity is set per device; it would help more if it were set per device/pool pair. It would also be more useful if the values ranged over 0 to replica_count rather than 0 to 1, but that is a usability issue, not a functional one: it just makes primary affinity more cumbersome to use. It was designed for a different purpose, though, so this is not the "right" solution; the right solution is a primary balancer.
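To make the multi-pool point concrete, here is a rough Python sketch. It is not Ceph code and does not model how primary affinity is actually applied. It assumes a simplified load model in which each read hits only the PG's primary (the first OSD in the acting set) and each write hits every OSD in the acting set. The pool layouts, the rates, and the `osd_load` helper are all hypothetical.

```python
from collections import defaultdict


def osd_load(pools):
    """Per-OSD I/O load under a simplified model (an assumption, not Ceph's):
    a read costs 1 on the primary only, a write costs 1 on every replica."""
    load = defaultdict(float)
    for pool in pools:
        for acting in pool["pgs"]:
            # Reads are served by the primary, i.e. acting[0].
            load[acting[0]] += pool["reads_per_pg"]
            # Writes land on every OSD in the acting set.
            for osd in acting:
                load[osd] += pool["writes_per_pg"]
    return dict(load)


# Hypothetical cluster: three OSDs, two pools, replica count 2.
# Every OSD holds 4 PGs and is primary for exactly 2 of them.
read_heavy = {"reads_per_pg": 90.0, "writes_per_pg": 10.0,
              "pgs": [(0, 1), (0, 2), (1, 2)]}
write_heavy = {"reads_per_pg": 10.0, "writes_per_pg": 90.0,
               "pgs": [(2, 0), (2, 1), (1, 0)]}

print(osd_load([read_heavy, write_heavy]))
# → {0: 380.0, 1: 300.0, 2: 220.0}
```

PG counts and primary counts are identical on all three OSDs, yet the load differs because OSD 0 holds the read-heavy pool's primaries and OSD 2 holds the write-heavy pool's. A single per-device primary affinity value cannot express "fewer primaries for pool A but not for pool B" on the same OSD, which is why a per device/pool setting, or a primary balancer, fits this case better.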
Regards,
Josh
On Wed, Oct 20, 2021 at 11:42 PM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
> Doesn't the existing mgr balancer already balance the PGs for each pool individually? So in your example, the PGs from the loaded pool will be balanced across all osds, as will the idle pool's PGs. So the net load is uniform, right?
If there’s a single CRUSH root and all pools share the same set of OSDs? I suspect that what he’s getting at is if pools use different sets of OSDs, or (eek) live on partly overlapping sets of OSDs.
> OTOH I could see a workload/capacity imbalance if there are mixed capacity but equal performance devices (e.g. a cluster with 50% 6TB HDDs and 50% 12TB HDDs).
> In that case we're probably better to treat the disks as uniform in size until the smaller osds fill up.
Primary affinity can help, with reads at least, but it’s a bit fussy.
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx