Hi Colin,

On 4/9/23 16:55, Colin Watson wrote:
> On Sun, Apr 09, 2023 at 03:58:28PM +0200, Alejandro Colomar wrote:
>> $ man -Kaw RLIMIT_NOFILE | sort | uniq -c
>>       3 /opt/local/man/share/man/man2/dup.2
>>       2 /opt/local/man/share/man/man2/fcntl.2
>>       5 /opt/local/man/share/man/man2/getrlimit.2
>>       3 /opt/local/man/share/man/man2/open.2
>>       1 /opt/local/man/share/man/man2/pidfd_getfd.2
>>       1 /opt/local/man/share/man/man2/pidfd_open.2
>>       2 /opt/local/man/share/man/man2/poll.2
>>       1 /opt/local/man/share/man/man2/seccomp_unotify.2
>>       4 /opt/local/man/share/man/man2/select.2
>>
>> Those numbers coincide with 1+ the number of symlinks for each of the
>> pages.  For example, see select.2:
>
> Thanks for the report.  Fixed by this commit:
>
> https://gitlab.com/man-db/man-db/-/commit/7ef30573a7023eb78bf70a34edaa4e3906531993

Heh, that was fast :)

As a side effect of not reading too many files, performance improved
considerably for bzip2 (~3x), and for gzip (~2x).  I built man from
source (tweaking with -O3, so I cheated a little bit), and here are the
results:

$ export MANPATH=/tmp/man/gz_/share/man
$ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l"
17
0.19
$ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l"
17
1.14

$ export MANPATH=/tmp/man/bz2/share/man
$ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l"
17
3.05
$ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l"
17
1.20

$ export MANPATH=/tmp/man/man/share/man
$ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l"
17
0.52
$ /bin/time -f %e dash -c "find $MANPATH -type f | xargs grep -l RLIMIT_NOFILE | wc -l"
17
0.01

Please consider this a new bug report, about performance.  See the last
block of commands.  man(1) takes half a second, while my loop with
find(1) and grep(1) is almost non-measurable.
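
For anyone who wants to check that both search pipelines agree before
timing them on a real tree, here is a minimal sketch.  It assumes a
throwaway fixture directory; the three page names and their contents
are made up for illustration and are not part of man-pages.

```sh
#!/bin/sh
# Hypothetical fixture: three tiny pages, two of which mention the keyword.
tmp=$(mktemp -d)
mkdir -p "$tmp/man2"
printf '.TH A 2\nRLIMIT_NOFILE\n'     >"$tmp/man2/a.2"
printf '.TH B 2\nnothing here\n'      >"$tmp/man2/b.2"
printf '.TH C 2\nsee RLIMIT_NOFILE\n' >"$tmp/man2/c.2"

# Uncompressed baseline: count the files that contain the keyword.
plain=$(find "$tmp" -type f | xargs grep -l RLIMIT_NOFILE | wc -l | tr -d ' ')

# gzip variant: compress the pages in place, then decompress each one
# on the fly and report the file name when the keyword is found.
gzip "$tmp"/man2/*.2
gz=$(find "$tmp" -type f | while read -r f; do
	gzip -dc <"$f" | grep -q RLIMIT_NOFILE && echo "$f"
done | wc -l | tr -d ' ')

echo "plain=$plain gz=$gz"
rm -rf "$tmp"
```

Both counts should match.  To compare timings, point the same pipelines
at a real MANPATH and wrap them in /bin/time -f %e as in the blocks
above.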
I could understand that man(1) has some overhead, but 52x suggests a
serious performance problem, especially since man(1) is faster reading
gzip-compressed pages than uncompressed ones (0.19 s vs 0.52 s; see the
first block).

Cheers,
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5