Re: Impact of large PG splits

Thank you, Janne.
I believe the default 5% target_max_misplaced_ratio would work as well; we've had good experience with that in the past, even without the autoscaler. I just haven't dealt with PGs this large before. I've been warning them for two years (back when the PGs were only about half this size), and now they have finally started to listen. Well, they would still ignore it if it didn't impact all kinds of things now. ;-)
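
For reference, that ratio can be checked and adjusted via the mgr config; the 0.03 below is just an arbitrary example of a more conservative value:

  # current value, used when deciding how much data may be misplaced at once
  ceph config get mgr target_max_misplaced_ratio

  # lower it to throttle splits harder, e.g. 3% instead of the default 5%
  ceph config set mgr target_max_misplaced_ratio 0.03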

Thanks,
Eugen

Quoting Janne Johansson <icepic.dz@xxxxxxxxx>:

On Tue, 9 Apr 2024 at 10:39, Eugen Block <eblock@xxxxxx> wrote:
I'm trying to estimate the possible impact when large PGs are
split. Here's one example of such a PG:

PG_STAT  OBJECTS  BYTES         OMAP_BYTES*  OMAP_KEYS*  LOG   DISK_LOG  UP
86.3ff   277708   414403098409  0            0           3092  3092      [187,166,122,226,171,234,177,163,155,34,81,239,101,13,117,8,57,111]
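
In case anyone wants to pull up the same numbers, an untested sketch (assuming jq is available; the field names follow the JSON output of ceph pg dump, and the 20 is arbitrary):

  # the 20 biggest PGs in the cluster, with object count and bytes
  ceph pg dump -f json 2>/dev/null \
    | jq -r '.pg_map.pg_stats
             | sort_by(.stat_sum.num_bytes) | reverse | .[:20][]
             | [.pgid, .stat_sum.num_objects, .stat_sum.num_bytes] | @tsv'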

If you ask for small increases of pg_num, only that many PGs will be
split at a time. There will still be a lot of data movement (50%,
because half of the data has to move to the newly created PG; on top
of that the number of PGs per OSD changes, but the balancer can also
work better afterwards), yet it will not affect the whole cluster if
you increase pg_num by, say, 8 at a time. As per the other reply, if
you bump the number by a small amount, wait for HEALTH_OK, then bump
some more, it will take a lot of calendar time but have a rather small
impact. My view is basically that this is far less impactful than
losing a whole OSD, which your cluster can hopefully survive, so it
should be able to handle a slow trickle of PG splits too.
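
A minimal, untested sketch of that bump-and-wait loop (POOL, STEP and TARGET are placeholders to fill in):

  POOL=mypool    # pool to split
  STEP=8         # how many PGs to add per round
  TARGET=4096    # final pg_num you are aiming for

  CUR=$(ceph osd pool get "$POOL" pg_num -f json | jq -r .pg_num)
  while [ "$CUR" -lt "$TARGET" ]; do
      NEXT=$(( CUR + STEP > TARGET ? TARGET : CUR + STEP ))
      ceph osd pool set "$POOL" pg_num "$NEXT"
      # on pre-Nautilus releases you would also bump pgp_num here
      # wait for the cluster to settle before the next bump
      while ! ceph health | grep -q HEALTH_OK; do
          sleep 60
      done
      CUR=$NEXT
  done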

You can also set a target number for the pool and let the autoscaler
run a few splits at a time; there are some settings that control how
aggressive the autoscaler will be, so it doesn't have to be
manual/scripted. But it's not very hard to script it yourself if you
are unsure about how much work the autoscaler would start at any given
time.
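
For the autoscaler route, a rough sketch of the relevant knobs (mypool is a placeholder pool name, 0.4 just an example ratio):

  # let the autoscaler manage pg_num for this pool
  ceph osd pool set mypool pg_autoscale_mode on

  # optionally hint at the share of the cluster the pool is expected
  # to use, so the autoscaler picks a sensible final pg_num
  ceph osd pool set mypool target_size_ratio 0.4

  # see what the autoscaler intends to do / is doing
  ceph osd pool autoscale-status

  # how fast it proceeds is bounded by the misplaced ratio mentioned
  # earlier in the thread (default 0.05)
  ceph config get mgr target_max_misplaced_ratio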



--
May the most significant bit of your life be positive.


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


