On 26-08-19 13:25, Simon Oosthoek wrote:
On 26-08-19 13:11, Wido den Hollander wrote:
<snip>
The reweight might actually cause even more confusion for the balancer.
The balancer uses upmap mode and that re-allocates PGs to different OSDs
if needed.
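For reference, the balancer state can be inspected and the mode switched
roughly like this (a minimal sketch; exact output differs per release):

  # show whether the balancer is active and which mode it uses
  ceph balancer status

  # switch to upmap mode and enable the balancer
  ceph balancer mode upmap
  ceph balancer on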
Looking at the output sent earlier I have some replies. See below.
<snip>
Looking at this output the balancing seems OK, but from a different
perspective.
PGs are allocated to OSDs, not objects or data. All OSDs have 95~97
Placement Groups allocated.
That's good! An almost perfect distribution.
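You can see that per-OSD PG count (and how full each OSD is) with
something like:

  # the PGS column shows placement groups per OSD, %USE how full each OSD is
  ceph osd df tree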
The problem that now arises is the difference in size between these
Placement Groups, as they hold different objects.
This is one of the side effects of larger disks: the PGs on them will
grow, and this leads to imbalance between the OSDs.
I *think* that increasing the number of PGs on this cluster would help,
but only for the pools which will contain most of the data.
This will consume a bit more CPU power and memory, but on modern systems
that should not be much of a problem.
The good thing is that with Nautilus you can also scale the number of
PGs back down if it ever becomes a problem.
More PGs mean smaller PGs and thus a better data distribution.
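A rough sketch of how that could look on Nautilus (the pool name is just
a placeholder, and the pg_num value needs to be sized for your cluster):

  # raise the PG count of the data-heavy pool (placeholder name and value)
  ceph osd pool set cephfs_data pg_num 2048

  # Nautilus can also suggest or manage PG counts per pool via the autoscaler
  ceph osd pool autoscale-status
  ceph osd pool set cephfs_data pg_autoscale_mode on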
<snip>
That makes sense; dividing the data into smaller chunks makes it more
flexible. The OSD nodes are quite underloaded, even with turbo recovery
mode on (10, not 32 ;-).
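(Assuming the "10" above refers to osd_max_backfills, that kind of
recovery tuning can be set centrally on Nautilus with something like:

  # assuming the 10 above is osd_max_backfills; adjust to taste
  ceph config set osd osd_max_backfills 10
  ceph config set osd osd_recovery_max_active 10
)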
When the cluster is in HEALTH_OK again, I'll increase the PGs for the
cephfs pools...
On second thought, I reverted my reweight commands and adjusted the PGs,
which were quite low for some of the pools. The reason they were low is
that when we first created them, we expected them to be rarely used, but
then we started filling them just for the sake of filling them, and those
pools are probably the cause of the imbalance.
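For completeness, reverting an earlier reweight and following the
rebalance looks roughly like this (the OSD id is a placeholder):

  # set the reweight back to its default of 1.0 (OSD id is a placeholder)
  ceph osd reweight 12 1.0

  # follow the misplaced/degraded percentages while the cluster rebalances
  watch ceph -s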
The cluster now has over 8% misplaced objects, so that can take a while...
Cheers
/Simon