Re: Accessibility of man pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

(Colin, something for you near the end; search ‘interesting’.)

Eli wrote:
> Dirk wrote:
> > $ find /usr/share/man -type f -exec bzgrep -l RLIMIT_NOFILE {} \;
...
> > find...             man -K...
> >
> > real 107.45         real 96.34
> > user 117.06         user 70.11
> > sys 14.43           sys 26.86
...
> > $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE
...
> > real 24.30
> > user 32.34
> > sys 6.84
>
> Multiprocessing, obviously.  Your CPU has more than one execution
> unit, so the pipe via xargs runs 'find' and 'bzgrep' in parallel on
> two different execution units.  By contrast, "find -exec" runs them
> sequentially, in a single thread.

No, I don't think it's that.

With the first, find(1) does stop whilst waiting for bzgrep to grep a
single file.  bzgrep may or may not run on the same core.  The important
thing is the one bzgrep per file and its fork() and exec() overhead.

The second has find fill a pipe's buffer with paths and when that's
full, xargs's read can return.  This continues until xargs either reads
end-of-file or reaches the argv[] limits.  It then runs a single bzgrep
with many filenames.  The fork+exec overhead is much reduced.

bzgrep is a shell script and has overhead before it gets to the
argument-processing loop.  That overhead is suffered many times if
bzgrep is run once per file.

The *zgrep scripts are a poor option in general due to this
one-grep-per-file overhead.  Better than nothing, but a grep which can
internally decompress all the different compression formats avoids this
shell overhead.

Here is an example.  260 files causes eight times as many clone(2)s,
i.e. forks.  I've added an extra ‘×...’ column.  The ls and xargs will
complete their work nearly instantly.  All the wall-clock time is the
single run of zgrep.

    $ pwd
    /usr/share/man/man7
    $ ls *.gz | wc -l
    260
    $
    $ ls *.gz | LC_ALL=C strace -fc xargs -rd\\n zgrep -H not-to-be-found
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- ---------    ------ ----------------
     93.70   27.510039        7555      3641 ×14  1560 wait4
      0.85    0.248763          26      9389           mmap
      0.68    0.198674          11     17166           rt_sigprocmask
      0.56    0.165702          35      4691           mprotect
      0.52    0.153146           6     22637           rt_sigaction
      0.50    0.146029          21      6780           read
      0.43    0.125451          10     12235      1040 close
      0.31    0.091715          29      3132           openat
      0.25    0.073542          11      6513       522 fcntl
      0.24    0.070822          12      5728      2080 stat
      0.23    0.068825          16      4171           fstat
      0.20    0.057703          24      2348           brk
      0.19    0.054838          69       786         4 execve
      0.18    0.052849          25      2081 ×8        clone
      0.17    0.051089          17      2862       782 access
      0.15    0.043284          55       782           munmap
      0.14    0.040012          48       819           write
      0.11    0.031992          11      2870       260 lseek
      0.11    0.031393           8      3902           dup2
      0.08    0.023038          22      1041           pipe
      0.07    0.021190          13      1560           rt_sigreturn
      0.06    0.018363          11      1564       782 arch_prctl
      0.05    0.016013           7      2081 ×8        getgid
      0.05    0.015314           7      2081 ×8        getegid
      0.05    0.015251           7      2081 ×8        getuid
      0.05    0.014780           7      2081 ×8        geteuid
      0.02    0.004703          18       260 ×1        sigaltstack
      0.01    0.003685          13       264 ×1        prlimit64
      0.01    0.003523           1      2084 ×8        getpid
      0.01    0.003443          13       260 ×1        set_tid_address
      0.01    0.003425          13       260 ×1        set_robust_list
      0.00    0.000094          23         4           getdents64
      0.00    0.000053          17         3         2 ioctl
      0.00    0.000033          16         2           poll
      0.00    0.000018          18         1           sysinfo
      0.00    0.000014          14         1           getppid
      0.00    0.000013          13         1           uname
      0.00    0.000013          13         1           getpgrp
    ------ ----------- ----------- --------- --------- ----------------
    100.00   29.358834                128163      7032 total

Compare with running sh(1) to run zcat and grep on each bunch of xargs's files.

    $ ls *.gz | LC_ALL=C strace -fc sh -c 'xargs -rd\\n zcat | grep not-to-be-found'
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- ---------    ------ ----------------
     82.18    0.150049       37512         4         1 wait4
      5.92    0.010814          12       881           read
      2.25    0.004116          14       286 ×1        openat
      2.17    0.003966          13       299 ×1        write
      1.33    0.002432           8       301 ×1      2 close
      1.24    0.002263          30        74           mmap
      1.19    0.002166           7       285 ×1        fstat
      0.58    0.001060          17        62        16 stat
      0.52    0.000954          34        28           mprotect
      0.42    0.000766          12        63           rt_sigaction
      0.35    0.000642          27        23         6 access
      0.30    0.000543          24        22           rt_sigprocmask
      0.19    0.000346          17        20         1 lseek
      0.17    0.000310          62         5           munmap
      0.14    0.000250          13        18           getuid
      0.13    0.000242          18        13         4 ioctl
      0.13    0.000238          13        18           getegid
      0.13    0.000236          13        18           geteuid
      0.11    0.000204          11        18           brk
      0.11    0.000199          11        18           getgid
      0.06    0.000116          11        10         5 arch_prctl
      0.06    0.000106           9        11         1 fcntl
      0.05    0.000083          10         8           getpid
      0.04    0.000082          13         6           prlimit64
      0.04    0.000076          38         2           pipe
      0.03    0.000061          20         3           clone
      0.03    0.000047          23         2           getpgrp
      0.02    0.000044          22         2           sysinfo
      0.02    0.000033           3         9         4 execve
      0.02    0.000031          10         3           dup2
      0.02    0.000030          30         1           set_tid_address
      0.01    0.000023          11         2           uname
      0.01    0.000022          22         1           set_robust_list
      0.01    0.000014           7         2           poll
      0.01    0.000014          14         1           rt_sigreturn
      0.01    0.000010           5         2           getppid
      0.00    0.000007           1         4           getdents64
      0.00    0.000000           0         1           sigaltstack
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.182595                  2526        40 total
    $

Fifty times fewer system calls.  Especially expensive ones.

Now, here's the interesting bit.  man here also forks once per file.
Presumably so the child can decompress the file and write to a pipe with
the existing search code in the parent reading from the other end
without caring the file is compressed.  Removing the pipe and fork could
speed things up a bit.  A function pointer would be one way.

    $ LC_ALL=C strace -fc man -Ks7 not-to-be-found
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- ---------    ------ ----------------
     61.90    1.890948          14    132630           read
     32.05    0.979069        1882       520 ×2    260 wait4
      1.68    0.051233          98       520 ×2    260 seccomp
      0.90    0.027443          13      2096 ×8        close
      0.61    0.018720          18      1024           write
      0.59    0.017941          17      1040 ×4        rt_sigprocmask
      0.53    0.016137          27       585        37 stat
      0.49    0.015120          27       551        15 openat
      0.26    0.007975          30       260 ×1        pipe
      0.15    0.004705          18       260 ×1        clone
      0.14    0.004345          16       260 ×1        rt_sigreturn
      0.14    0.004254          16       263 ×1        ioctl
      0.13    0.003824          14       262 ×1        getpid
      0.10    0.003067           1      1573           rt_sigaction
      0.08    0.002581           9       273           fstat
      0.07    0.002052           3       520 ×2        prctl
      0.07    0.002040           7       263 ×1        lseek
      0.06    0.001959           7       260 ×1        dup
      0.04    0.001115           2       520 ×2        dup2
      0.01    0.000204          17        12           brk
      0.00    0.000000           0       145           lstat
      0.00    0.000000           0        31           mmap
      0.00    0.000000           0        12           mprotect
      0.00    0.000000           0         1           munmap
      0.00    0.000000           0         1         1 access
      0.00    0.000000           0         1           execve
      0.00    0.000000           0         3           fcntl
      0.00    0.000000           0         6           readlink
      0.00    0.000000           0         1           umask
      0.00    0.000000           0         1           sysinfo
      0.00    0.000000           0         1           getuid
      0.00    0.000000           0         1           getgid
      0.00    0.000000           0         1           geteuid
      0.00    0.000000           0         1           getegid
      0.00    0.000000           0         3           fstatfs
      0.00    0.000000           0         2         1 arch_prctl
      0.00    0.000000           0         8           getdents64
    ------ ----------- ----------- --------- --------- ----------------
    100.00    3.054732                143911       574 total
    $

-- 
Cheers, Ralph.



[Index of Archives]     [Kernel Documentation]     [Netdev]     [Linux Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux