Re: Unbalanced data distribution

Thomas Schneider <74cmonty@xxxxxxxxx> · Wed, 23 Oct 2019 08:14:30 +0200

The average number of PGs is 120 on the 7.2TB disks and 35 on the
1.6TB disks.
That is a difference of a factor of 3-4.
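
As a quick check of these figures, here is a small Python sketch that puts the PG ratio next to the raw capacity ratio. The averages and disk sizes are the ones quoted above; the variable names are only illustrative.

```python
# Average PG counts per OSD, as quoted above.
pgs_large = 120   # 7.2TB disks
pgs_small = 35    # 1.6TB disks

# Disk sizes in TB.
size_large = 7.2
size_small = 1.6

# How many more PGs a large disk holds than a small one.
pg_ratio = pgs_large / pgs_small
# How much more capacity a large disk has than a small one.
size_ratio = size_large / size_small

print(f"PG ratio:   {pg_ratio:.2f}")
print(f"Size ratio: {size_ratio:.2f}")
```

The PG ratio comes out at about 3.4, while the capacity ratio is 4.5.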

However, I don't understand why this should explain the unbalanced data
distribution on the 1.6TB disks only (the 7.2TB disks are balanced).
All the disks are configured, through a suitable CRUSH map, to serve the
same pool only. This means any other pool is served by different disks.
Here's an example for one node; the other 6 nodes are similar:
host ld5505-hdd_strgbox {
        id -16                  # do not change unnecessarily
        id -18 class hdd        # do not change unnecessarily
        id -20 class nvme       # do not change unnecessarily
        id -49 class ssd        # do not change unnecessarily
        # weight 78.720
        alg straw2
        hash 0  # rjenkins1
        item osd.76 weight 1.640
        item osd.77 weight 1.640
        item osd.78 weight 1.640
        [...]
        item osd.97 weight 1.640
        item osd.102 weight 1.640
        item osd.110 weight 1.640
}
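
To sanity-check a bucket like this one, a minimal Python sketch can pull the item weights out of the text and compare them with the commented host weight. The regular expression and the helper name item_weights are my own, not part of any Ceph tool, and only the items visible in the excerpt are used; the elided part stays elided.

```python
import re

# The items visible in the excerpt above. The "[...]" part is elided there
# and is deliberately not filled in here.
bucket_excerpt = """
item osd.76 weight 1.640
item osd.77 weight 1.640
item osd.78 weight 1.640
item osd.97 weight 1.640
item osd.102 weight 1.640
item osd.110 weight 1.640
"""

# Hypothetical helper pattern: matches "item osd.N weight W" entries.
ITEM_RE = re.compile(r"item\s+(osd\.\d+)\s+weight\s+([0-9.]+)")

def item_weights(text):
    """Return {osd_name: weight} for every 'item ... weight ...' entry."""
    return {name: float(w) for name, w in ITEM_RE.findall(text)}

weights = item_weights(bucket_excerpt)
visible_sum = sum(weights.values())

# Assumption: 48 OSDs per host, as in the node list quoted further down.
expected_host_weight = 48 * 1.640

print(f"{len(weights)} visible items, summed weight {visible_sum:.3f}")
print(f"48 x 1.640 = {expected_host_weight:.3f}")
```

With 48 OSDs of weight 1.640 each, the host weight is 78.720, which matches the commented weight in the bucket.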

In addition, I don't understand why distributing the disks equally over
all nodes should solve the issue.
My understanding is that Ceph's algorithm should be smart enough to
determine which object should be placed where and to ensure balanced
utilisation.
I agree that the impact would be major if a node with 7.2TB disks went
down, though.
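
One possible contributing factor, offered as a guess and not as a diagnosis of this cluster: pseudo-random placement gets statistically noisier as the number of PGs per OSD drops. The toy Python sketch below assigns PG replicas by plain weighted random choice and compares the relative spread of PG counts for the two disk sizes. It is not CRUSH or straw2, it ignores failure domains and the balancer, and the placement count (29400, chosen so that the small disks average about 35 PGs) and the seed are assumptions of mine.

```python
import random
import statistics

def simulate(num_small=192, num_large=144, w_small=1.6, w_large=7.2,
             placements=29400, seed=1):
    """Toy model: assign PG replicas to OSDs by weighted random choice.

    This is NOT CRUSH/straw2. It ignores failure domains and replica
    separation and only illustrates statistical spread.
    """
    rng = random.Random(seed)
    # 4 nodes x 48 small disks, 3 nodes x 48 large disks, as in the node list.
    weights = [w_small] * num_small + [w_large] * num_large
    counts = [0] * len(weights)
    for osd in rng.choices(range(len(weights)), weights=weights, k=placements):
        counts[osd] += 1
    small, large = counts[:num_small], counts[num_small:]

    def cv(xs):
        # Coefficient of variation: standard deviation relative to the mean.
        return statistics.pstdev(xs) / statistics.mean(xs)

    return cv(small), cv(large)

cv_small, cv_large = simulate()
print(f"relative spread, 1.6TB OSDs: {cv_small:.1%}")
print(f"relative spread, 7.2TB OSDs: {cv_large:.1%}")
```

In this model the OSDs with fewer PGs show the larger relative spread, roughly in line with the 1/sqrt(n) behaviour one would expect from random placement.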

On 23.10.2019 at 06:59, Anthony D'Atri wrote:
> I agree wrt making the node weights uniform.
>
> When mixing drive sizes, be careful that the larger ones don’t run afoul of the PG max: they will receive more PGs than the smaller ones, and if you lose a node that might be enough to send some over the max. Run ‘ceph osd df’ and look at the PG counts.
>
> This can also degrade performance since IO is not spread uniformly. Primary affinity hops can mitigate somewhat.
>
>> On Oct 22, 2019, at 8:26 PM, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:
>>
>> On 10/22/19 7:52 PM, Thomas wrote:
>>> Node 1
>>> 48x 1.6TB
>>> Node 2
>>> 48x 1.6TB
>>> Node 3
>>> 48x 1.6TB
>>> Node 4
>>> 48x 1.6TB
>>> Node 5
>>> 48x 7.2TB
>>> Node 6
>>> 48x 7.2TB
>>> Node 7
>>> 48x 7.2TB
>> I suggest balancing the disks across hosts, e.g. ~28x 1.6TB + 20x 7.2TB per host.
>>
>>> Why is the data distribution on the 1.6TB disks unequal?
>>> How can I correct this?
>> The balancer in upmap mode works with pools. I guess some of your 1.6TB OSDs do not serve some pools.
>>
>>
>>
>> k
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
