On 17. 08. 22 at 12:47, Heming Zhao wrote:
On Wed, Aug 17, 2022 at 11:46:16AM +0200, Zdenek Kabelac wrote:
On 17. 08. 22 at 10:43, Heming Zhao wrote:
On Wed, Aug 17, 2022 at 10:06:35AM +0200, Zdenek Kabelac wrote:
On 17. 08. 22 at 4:03, Heming Zhao wrote:
On Tue, Aug 16, 2022 at 12:26:51PM +0200, Zdenek Kabelac wrote:
On 16. 08. 22 at 12:08, Heming Zhao wrote:
Ooh, very sorry, the subject is wrong: it is not IO performance but high CPU load
that is triggered by pvmove.
The machine is connected to more than 250 disks. The VG has 103 PVs & 79 LVs.
# /sbin/vgs
VG #PV #LV #SN Attr VSize VFree
<vgname> 103 79 0 wz--n- 52t 17t
Ok - so the main issue could be too many PVs combined with the relatively high
latency of mpath devices (which could actually all be simulated easily in the lvm2 test suite).
The load is generated by multipath. lvmpolld performs the IN_CLOSE_WRITE action,
which is the trigger.
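One way to confirm this (assuming it's the udev 'watch' rule on the PVs reacting to
the close-after-write) is to watch block events while lvmpolld is polling, e.g.:
# udevadm monitor --udev --subsystem-match=block
and see whether the bursts of change events on the mpath devices line up with the
lvmpolld activity.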
I'll check whether lvmpolld is using correct locking while checking for the
operational state - you may possibly extend the polling interval
(although that's one of the areas the mentioned patchset has been enhancing).
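For example (just a sketch - the interval value and device name here are
placeholders, and defaults may differ on your build), per command:
# pvmove --interval 60 /dev/mapper/<mpath-pv>
or globally in /etc/lvm/lvm.conf:
activation {
    # check pvmove/lvconvert progress every 60s instead of the default
    polling_interval = 60
}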
If you have too many disks in the VG (again it's unclear how many paths
and how many distinct PVs there are) - a user may *significantly* reduce the burden
associated with metadata updating by reducing the number of 'actively'
maintained metadata areas in the VG - i.e. if you have 100 PVs in a VG - you may
keep metadata on only 5-10 PVs to have 'enough' duplicate copies of lvm2
metadata within the VG (vgchange --metadatacopies X) - clearly it depends on
the use case and how many PVs are added to/removed from the VG over its
lifetime....
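For example (just a sketch - the VG name and the number of copies are placeholders,
and this trades some metadata redundancy for less update overhead):
# vgchange --metadatacopies 8 <vgname>
# vgs -o +vg_mda_count,vg_mda_copies <vgname>
# pvs -o +pv_mda_count,pv_mda_used_count
The vgs/pvs calls only show how many metadata areas exist and how many are
actually being kept up to date.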
Thanks for the important info. I also found the related VG config in
/etc/lvm/backup/<vgname>; this file shows 'metadata_copies = 0'.
This could be another solution. But why doesn't lvm2 take this behavior by
default, or give a notification when the PV number goes beyond a threshold while
the user is executing pvs/vgs/lvs or pvmove?
There are too many magic switches; users don't know how to adjust them for
better performance.
The problem is always the same - selecting the right 'default' :) What suits user
A is sometimes a 'no go' for user B. So ATM it's more 'secure/safe' to keep
metadata with each PV - so when a PV is discovered, it's known how the VG using
such a PV looks. When only a fraction of the PVs carry the info - the VG is far more
fragile to damage when disks are lost, i.e. there is no 'smart' mechanism to
pick disks in different racks....
So this option is there for administrators who are 'clever' enough to deal
with the new set of problems it may create for them.
Yes - lvm2 has a lot of options - but that's usually what is necessary when we
want to be capable of providing an optimal solution for a really wide variety of
setups - so I think spending a couple of minutes on reading the man pages pays off -
especially if you had to spend 'days' on building your disk racks ;)
And yes, we may add a few more hints - but then we are asked by the 'second' group of
users ('skilled admins') why we print so many dumb messages every time
they do some simple operation :)
I'm busy with many bugs and still can't find a time slot to set up an env.
This performance issue relates to mpath, and I can't find an easy
way to set up such an env. (I suspect the issue may also be triggered by setting up
300 fake PVs without mpath and then running a pvmove cmd.)
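Something along these lines is what I have in mind (untested sketch - file names,
sizes and the LV layout are arbitrary, and it assumes no other loop devices are in use):
# for i in $(seq 1 300); do truncate -s 1G /var/tmp/fake_pv_$i.img; losetup -f /var/tmp/fake_pv_$i.img; done
# pvcreate /dev/loop[0-9]*
# vgcreate testvg /dev/loop[0-9]*
# lvcreate -l 80%FREE -n testlv testvg
# pvmove /dev/loop1
(Creating more and smaller LVs would get closer to the real 103 PV / 79 LV layout.)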
'Fragmented' LVs with small segment sizes may significantly raise the amount of
metadata updates needed during a pvmove operation, as each individual LV segment
will be mirrored by its own mirror.
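To see how fragmented the LVs actually are, something like this should list the
per-LV segment counts and the individual segments (just an illustration - the field
list can be adjusted):
# lvs -o +seg_count <vgname>
# lvs --segments -o lv_name,seg_start_pe,seg_size <vgname>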
Zdenek