Ooh, very sorry, the subject is wrong: it is not IO performance but high CPU load
that is triggered by pvmove.

On Tue, Aug 16, 2022 at 11:38:52AM +0200, Zdenek Kabelac wrote:
> On 16. 08. 22 at 11:28, Heming Zhao wrote:
> > Hello maintainers & list,
> >
> > I bring a story:
> > One SUSE customer suffered an lvmpolld issue which caused a dramatic IO
> > performance decrease.
> >
> > How to trigger:
> > When the machine connects a large number of LUNs (e.g. 80~200) and pvmove is
> > run (e.g. moving a single disk to a new one, with a command like:
> > pvmove disk1 disk2), the system suffers high CPU load. But when the system
> > connects ~10 LUNs, the performance is fine.
> >
> > We found two workarounds:
> > 1. Set lvm.conf 'activation/polling_interval=120'.
> > 2. Write a special udev rule which makes udev ignore the watch event for
> >    mpath devices:
> >    echo 'ENV{DM_UUID}=="mpath-*", OPTIONS+="nowatch"' >\
> >      /etc/udev/rules.d/90-dm-watch.rules
> >
> > Applying either one of the two makes the performance issue disappear.
> >
> > ** the root cause **
> >
> > lvmpolld periodically requests status info to update the pvmove progress.
> >
> > On every polling_interval, lvm2 updates the VG metadata. The update job
> > calls sys_close, which triggers a systemd-udevd IN_CLOSE_WRITE event, e.g.:
> > 2022-<time>-xxx <hostname> systemd-udevd[pid]: dm-179: Inotify event: 8 for /dev/dm-179
> > (8 is IN_CLOSE_WRITE.)
> >
> > These VGs' underlying devices are multipath devices. So when lvm2 updates
> > the metadata, even if pvmove writes only a little data, the sys_close action
> > triggers udev's "watch" mechanism, which gets notified frequently about a
> > process that has written to the device and closed it. This causes frequent,
> > pointless re-evaluation of the udev rules for these devices.
> >
> > My question: Do the LVM2 maintainers have any idea how to fix this bug?
> >
> > In my view, could lvm2 drop the fds of the VGs' devices until pvmove
> > finishes?
>
> Hi
>
> Please provide more info about the lvm2 metadata and also some 'lvs -avvvvv'
> trace so we can get a better picture of the layout - also the versions of
> lvm2, systemd and kernel in use.
>
> pvmove progresses by mirroring each segment of an LV - so if there were a lot
> of segments, then each such update may trigger a udev watch rule event.
>
> But ATM I can hardly imagine how this could cause a 'dramatic' performance
> decrease - maybe there is something wrong with the udev rules on the system?
>
> What is the actual impact?
>
> Note - pvmove was never designed as a high-performance operation (in fact it
> tries not to eat all the disk bandwidth as such)
>
> Regards
> Zdenek

My mistake, I'll write it here again:

The subject is wrong: it is not IO performance but high CPU load that is
triggered by pvmove. There is no IO performance issue.

When the system is connecting 80~200 LUNs, the CPU load increases by 15~20 and
the CPU usage by ~20%, which corresponds to about 5-6 cores and at times leads
to those cores being fully utilized. In other words: a single pvmove process
costs 5-6 (sometimes 10) cores of utilization. This is abnormal and
unacceptable.

The lvm2 version is 2.03.05, the kernel is 5.3, and systemd is v246.

BTW: I changed this mail's subject from:
  lvmpolld causes IO performance issue
to:
  lvmpolld causes high cpu load issue
Please use this mail for later discussion.

- Heming

_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
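For reference, a rough sketch of how the udev watch storm described above can be
observed, and how workaround 1 looks as an lvm.conf fragment. The udevadm
invocation and the exact config layout below are illustrative assumptions, not
commands taken from the report:

    # Watch kernel uevents and the corresponding udev-processed events for
    # block devices while pvmove runs; the periodic bursts on the multipath
    # dm-* nodes should line up with activation/polling_interval.
    udevadm monitor --kernel --udev --subsystem-match=block

    # Workaround 1 expressed as an lvm.conf fragment: raise the lvmpolld
    # polling interval so metadata updates (and the close-triggered watch
    # events) happen far less often.
    activation {
        polling_interval = 120
    }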