Re: Discussion: performance issue on event activation mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/3/21 6:02 AM, David Teigland wrote:
On Fri, Jul 02, 2021 at 09:22:03PM +0000, Martin Wilck wrote:
On Fr, 2021-07-02 at 16:09 -0500, David Teigland wrote:
On Sun, Jun 06, 2021 at 02:15:23PM +0800, heming.zhao@xxxxxxxx wrote:
dev_cache_scan //order: O(n^2)
  + _insert_dirs //O(n)
  | if obtain_device_list_from_udev() true
  |   _insert_udev_dir //O(n)
  |
  + dev_cache_index_devs //O(n)

I've been running some experiments and trying some patches to improve
this.  By setting obtain_device_list_from_udev=0, and using the
attached
patch to disable dev_cache_index_devs, the pvscan is much better.

systemctl status lvm2-pvscan appears to show that the pvscan command
itself runs for only 2-4 seconds, while the service as a whole takes
around 15 seconds.  See the 16 sec gap below from the end of pvscan
to the systemd Started message.  If that's accurate, the remaining
delay
would lie outside lvm.

Jul 02 15:27:57 localhost.localdomain systemd[1]: Starting LVM event
activation on device 253:1710...
Jul 02 15:28:00 localhost.localdomain lvm[65620]:   pvscan[65620] PV
/dev/mapper/mpathalz online, VG 1ed02c7d-0019-43c4-91b5-f220f3521ba9
is complete.
Jul 02 15:28:00 localhost.localdomain lvm[65620]:   pvscan[65620] VG
1ed02c7d-0019-43c4-91b5-f220f3521ba9 run autoactivation.
Jul 02 15:28:00 localhost.localdomain lvm[65620]:   1 logical
volume(s) in volume group "1ed02c7d-0019-43c4-91b5-f220f3521ba9" now
active

Printing this message is really the last thing that pvscan does?

I've not seen anything measurable after that message.  However, digging
through the command init/exit paths I did find libudev setup/destroy calls
that can also be skipped when the command is not accessing libudev info.
A quick check seemed to show some further improvement from dropping those.
(That will be part of the larger patch isolating libudev usage.)

I'm still seeing significant delay between the apparent command exit and
the systemd "Started" message, but this will require a little more work to
prove.


I applied the patch and got the very similar result.


```with patch
# systemd-analyze blame
         27.679s lvm2-pvscan@134:896.service
         27.664s lvm2-pvscan@134:800.service
         27.648s lvm2-pvscan@134:960.service
         27.645s lvm2-pvscan@132:816.service
         27.589s lvm2-pvscan@134:816.service
         27.582s lvm2-pvscan@133:992.service
         ... ...

# systemctl status lvm2-pvscan@134:896.service
... ...
Jul 03 19:43:02 sle15sp2-base40g systemd[1]: Starting LVM event activation on device 134:896...
Jul 03 19:43:03 sle15sp2-base40g lvm[24817]:   pvscan[24817] PV /dev/sdalm online, VG vg_sdalm is complet>
Jul 03 19:43:03 sle15sp2-base40g lvm[24817]:   pvscan[24817] VG vg_sdalm run autoactivation.
Jul 03 19:43:03 sle15sp2-base40g lvm[24817]:   1 logical volume(s) in volume group "vg_sdalm" now active
Jul 03 19:43:30 sle15sp2-base40g systemd[1]: Started LVM event activation on device 134:896.
```

the 27.679s get from 19:43:30 minus 19:43:02.

and 27s is 10s quick than the lvm without patch (by today's test result)
```without patch
# systemd-analyze blame
         37.650s lvm2-pvscan@133:992.service
         37.650s lvm2-pvscan@133:1008.service
         37.649s lvm2-pvscan@133:896.service
         37.649s lvm2-pvscan@134:960.service
         37.612s lvm2-pvscan@133:880.service
         37.612s lvm2-pvscan@133:864.servic
         ... ...

# systemctl status lvm2-pvscan@133:992.service
... ...
Jul 03 19:31:28 sle15sp2-base40g systemd[1]: Starting LVM event activation on device 133:992...
Jul 03 19:31:30 sle15sp2-base40g lvm[24243]:   pvscan[24243] PV /dev/sdalc online, VG vg_sdalc is complet>
Jul 03 19:31:30 sle15sp2-base40g lvm[24243]:   pvscan[24243] VG vg_sdalc run autoactivation.
Jul 03 19:31:30 sle15sp2-base40g lvm[24243]:   1 logical volume(s) in volume group "vg_sdalc" now active
Jul 03 19:32:05 sle15sp2-base40g systemd[1]: Started LVM event activation on device 133:992.
```

I added log in lvm2 code, the time of log "1 logical volume(s) in volume
group "vg_sdalm" now active" is the real pvscan finished time. Martin
comment in previous mail is may right, systemd is may busy and delayed to
detect/see the pvscan service completion.

Thanks,
heming


_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/




[Index of Archives]     [Gluster Users]     [Kernel Development]     [Linux Clusters]     [Device Mapper]     [Security]     [Bugtraq]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]

  Powered by Linux