I wonder if using a shell wrapper to detect whether pvscan actually exits way earlier than the unit does is a more accurate approach. On Sat, 3 Jul 2021 at 19:50, heming.zhao@xxxxxxxx <heming.zhao@xxxxxxxx> wrote: > > On 7/3/21 6:02 AM, David Teigland wrote: > > On Fri, Jul 02, 2021 at 09:22:03PM +0000, Martin Wilck wrote: > >> On Fr, 2021-07-02 at 16:09 -0500, David Teigland wrote: > >>> On Sun, Jun 06, 2021 at 02:15:23PM +0800, heming.zhao@xxxxxxxx wrote: > >>>> dev_cache_scan //order: O(n^2) > >>>> + _insert_dirs //O(n) > >>>> | if obtain_device_list_from_udev() true > >>>> | _insert_udev_dir //O(n) > >>>> | > >>>> + dev_cache_index_devs //O(n) > >>> > >>> I've been running some experiments and trying some patches to improve > >>> this. By setting obtain_device_list_from_udev=0, and using the > >>> attached > >>> patch to disable dev_cache_index_devs, the pvscan is much better. > >>> > >>> systemctl status lvm2-pvscan appears to show that the pvscan command > >>> itself runs for only 2-4 seconds, while the service as a whole takes > >>> around 15 seconds. See the 16 sec gap below from the end of pvscan > >>> to the systemd Started message. If that's accurate, the remaining > >>> delay > >>> would lie outside lvm. > >>> > >>> Jul 02 15:27:57 localhost.localdomain systemd[1]: Starting LVM event > >>> activation on device 253:1710... > >>> Jul 02 15:28:00 localhost.localdomain lvm[65620]: pvscan[65620] PV > >>> /dev/mapper/mpathalz online, VG 1ed02c7d-0019-43c4-91b5-f220f3521ba9 > >>> is complete. > >>> Jul 02 15:28:00 localhost.localdomain lvm[65620]: pvscan[65620] VG > >>> 1ed02c7d-0019-43c4-91b5-f220f3521ba9 run autoactivation. > >>> Jul 02 15:28:00 localhost.localdomain lvm[65620]: 1 logical > >>> volume(s) in volume group "1ed02c7d-0019-43c4-91b5-f220f3521ba9" now > >>> active > >> > >> Printing this message is really the last thing that pvscan does? > > > > I've not seen anything measurable after that message. However, digging > > through the command init/exit paths I did find libudev setup/destroy calls > > that can also be skipped when the command is not accessing libudev info. > > A quick check seemed to show some further improvement from dropping those. > > (That will be part of the larger patch isolating libudev usage.) > > > > I'm still seeing significant delay between the apparent command exit and > > the systemd "Started" message, but this will require a little more work to > > prove. > > > > I applied the patch and got the very similar result. > > > ```with patch > # systemd-analyze blame > 27.679s lvm2-pvscan@134:896.service > 27.664s lvm2-pvscan@134:800.service > 27.648s lvm2-pvscan@134:960.service > 27.645s lvm2-pvscan@132:816.service > 27.589s lvm2-pvscan@134:816.service > 27.582s lvm2-pvscan@133:992.service > ... ... > > # systemctl status lvm2-pvscan@134:896.service > ... ... > Jul 03 19:43:02 sle15sp2-base40g systemd[1]: Starting LVM event activation on device 134:896... > Jul 03 19:43:03 sle15sp2-base40g lvm[24817]: pvscan[24817] PV /dev/sdalm online, VG vg_sdalm is complet> > Jul 03 19:43:03 sle15sp2-base40g lvm[24817]: pvscan[24817] VG vg_sdalm run autoactivation. > Jul 03 19:43:03 sle15sp2-base40g lvm[24817]: 1 logical volume(s) in volume group "vg_sdalm" now active > Jul 03 19:43:30 sle15sp2-base40g systemd[1]: Started LVM event activation on device 134:896. > ``` > > the 27.679s get from 19:43:30 minus 19:43:02. > > and 27s is 10s quick than the lvm without patch (by today's test result) > ```without patch > # systemd-analyze blame > 37.650s lvm2-pvscan@133:992.service > 37.650s lvm2-pvscan@133:1008.service > 37.649s lvm2-pvscan@133:896.service > 37.649s lvm2-pvscan@134:960.service > 37.612s lvm2-pvscan@133:880.service > 37.612s lvm2-pvscan@133:864.servic > ... ... > > # systemctl status lvm2-pvscan@133:992.service > ... ... > Jul 03 19:31:28 sle15sp2-base40g systemd[1]: Starting LVM event activation on device 133:992... > Jul 03 19:31:30 sle15sp2-base40g lvm[24243]: pvscan[24243] PV /dev/sdalc online, VG vg_sdalc is complet> > Jul 03 19:31:30 sle15sp2-base40g lvm[24243]: pvscan[24243] VG vg_sdalc run autoactivation. > Jul 03 19:31:30 sle15sp2-base40g lvm[24243]: 1 logical volume(s) in volume group "vg_sdalc" now active > Jul 03 19:32:05 sle15sp2-base40g systemd[1]: Started LVM event activation on device 133:992. > ``` > > I added log in lvm2 code, the time of log "1 logical volume(s) in volume > group "vg_sdalm" now active" is the real pvscan finished time. Martin > comment in previous mail is may right, systemd is may busy and delayed to > detect/see the pvscan service completion. > > Thanks, > heming > > > _______________________________________________ > linux-lvm mailing list > linux-lvm@xxxxxxxxxx > https://listman.redhat.com/mailman/listinfo/linux-lvm > read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/ _______________________________________________ linux-lvm mailing list linux-lvm@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/linux-lvm read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/