Re: lvmpolld causes high cpu load issue

On 17. 08. 22 at 14:39, Martin Wilck wrote:
On Wed, 2022-08-17 at 18:47 +0800, Heming Zhao wrote:
On Wed, Aug 17, 2022 at 11:46:16AM +0200, Zdenek Kabelac wrote:



ATM I'm not even sure whether you are complaining about the CPU usage of lvmpolld itself or just about the huge udev rule processing overhead.

The load is generated by multipath. lvmpolld performs the close-after-write that raises the IN_CLOSE_WRITE event, which is the trigger.

Let's be clear here: every close-after-write operation triggers udev's "watch" mechanism for block devices, which causes the udev rules to be executed for the device. That is not a cheap operation. In the case at hand, the customer was observing a lot of "multipath -U" commands. So apparently a significant part of the udev rule processing was spent in "multipath -U". Running "multipath -U" is important, because the rule could have been triggered by a change in the number of available path devices, and later commands run from udev rules might hang indefinitely if the multipath device had no usable paths any more. "multipath -U" is already quite well optimized, but it needs to do some I/O to complete its work, thus it takes a few milliseconds to run.
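
To illustrate why every close-after-write is expensive, here is a minimal C sketch of the inotify mechanism that udev's "watch" option is built on: a watch for IN_CLOSE_WRITE on a device node, where each event would correspond to one full udev rule run (including "multipath -U"). /dev/sdX is a placeholder device node, and this is only an illustration of the mechanism, not udev's actual implementation.

/* Minimal sketch of the inotify-based "watch" that udev places on block
 * device nodes: any close() after a write generates IN_CLOSE_WRITE, which
 * udev turns into a synthetic "change" event and a full rule run.
 * /dev/sdX is a placeholder device node. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/inotify.h>

int main(void)
{
    char buf[4096];
    int fd = inotify_init1(IN_CLOEXEC);
    if (fd < 0) { perror("inotify_init1"); return EXIT_FAILURE; }

    /* udev adds a watch like this for every block device that has the
     * "watch" option set in its rules. */
    if (inotify_add_watch(fd, "/dev/sdX", IN_CLOSE_WRITE) < 0) {
        perror("inotify_add_watch");
        return EXIT_FAILURE;
    }

    for (;;) {
        ssize_t len = read(fd, buf, sizeof(buf));
        if (len <= 0)
            break;
        /* Each batch of events here corresponds to close-after-write on the
         * device, i.e. another full pass through the udev rules. */
        printf("close-after-write seen, %zd bytes of events\n", len);
    }
    close(fd);
    return 0;
}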

IOW, it would be misleading to point at multipath. close-after-write
operations on block devices should be avoided if possible. As you
probably know, the purpose of udev's "watch" operation is to be able to
determine changes on layered devices, e.g. newly created LVs or the
like. "pvmove" is special, because by definition it will usually not
cause any changes in higher layers. Therefore it might make sense to
disable the udev watch on the affected PVs while pvmove is running, and
trigger a single change event (re-enabling the watch) after the pvmove
has finished. If that is impossible, lvmpolld and other lvm tools that
are involved in the pvmove operation should avoid calling close() on
the PVs, IOW keep the fds open until the operation is finished.
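
As an illustration of the "single change event after pvmove" idea, here is a hedged C sketch that emits one synthetic change uevent by writing "change" to a device's sysfs uevent file, which is essentially what e.g. "udevadm trigger --action=change" does. The sysfs path is a placeholder and the error handling is minimal; this is a sketch of the idea, not a proposed lvm2 patch.

/* Hedged sketch of the "single change event after pvmove" idea: writing
 * "change" to a device's uevent file makes the kernel emit one synthetic
 * change uevent, so udev re-runs its rules (and re-arms the watch) once,
 * instead of once per close-after-write during the move.
 * The sysfs path is a placeholder. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int trigger_change(const char *uevent_path)
{
    int fd = open(uevent_path, O_WRONLY);
    if (fd < 0) {
        perror("open uevent");
        return -1;
    }
    if (write(fd, "change", 6) != 6) {
        perror("write uevent");
        close(fd);
        return -1;
    }
    close(fd);
    return 0;
}

int main(void)
{
    /* e.g. the PV that pvmove has just finished touching */
    return trigger_change("/sys/class/block/sdX/uevent") ? 1 : 0;
}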

Hi

Let's make clear that we are very well aware of all the constraints associated with the udev rule logic (and we tried quite hard to minimize the impact). However, the udevd developers kind of 'misunderstood' how badly the existing watch rule logic would impact system performance, and the story kind of 'continues' with systemd & D-Bus services, unfortunately...

However, let's focus on 'pvmove', as it is a potentially very lengthy operation. It's not feasible to keep the VG locked/blocked across an operation which might take even days with slower storage and large moved sizes (a write access/lock disables all readers...).

So lvm2 does try to minimize the locking time. We will re-validate that only the necessary 'VG updating' operations use 'write' access - occasionally, due to some unrelated code changes, an unwanted 'write' VG open might slip in - but we can't keep the operation blocking a whole VG because of slow udev rule processing.

Under normal circumstances a udev rule should be processed very fast - unless there is something mis-designed causing CPU overload.

But as mentioned a few times already - without more knowledge about the case we can hardly guess the exact reason. We have already provided a useful suggestion: reduce the number of devices udev has to 'process' by reducing the number of 'lvm2 metadata PVs'. A big reason for frequent metadata updates would be heavy segmentation of the LV - but we will not know this without seeing the user's VG 'metadata' in this case...
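
To make the 'fewer metadata PVs' suggestion a bit more concrete, here is a small, hedged C sketch that just wraps the pvchange CLI to mark a PV's metadata areas as ignored, so routine metadata updates touch fewer devices and generate fewer close-after-write events. /dev/sdX is a placeholder PV; whether dropping metadata copies is acceptable depends on the redundancy the VG needs, so check pvchange(8) before doing this on a real VG.

/* Hedged sketch: mark a PV's metadata areas as ignored so metadata
 * updates during pvmove are written to fewer devices.
 * /dev/sdX is a placeholder PV. */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int pv_metadata_ignore(const char *pv)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execlp("pvchange", "pvchange", "--metadataignore", "y", pv, (char *)NULL);
        _exit(127);             /* exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

int main(void)
{
    return pv_metadata_ignore("/dev/sdX") ? EXIT_FAILURE : EXIT_SUCCESS;
}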


Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



