Re: [PATCH] kbuild: modinst: Enable multithread xz compression

On Fri, Feb 24, 2023 at 9:13 PM André Almeida <andrealmeid@xxxxxxxxxx> wrote:
>
> Hi Masahiro,
>
> On 24/02/2023 02:38, Masahiro Yamada wrote:
> > On Thu, Feb 23, 2023 at 9:17 AM André Almeida <andrealmeid@xxxxxxxxxx> wrote:
> >>
> >> As it's done for zstd compression, enable multithread compression for
> >> xz to speed up module installation.
> >>
> >> Signed-off-by: André Almeida <andrealmeid@xxxxxxxxxx>
> >> ---
> >>
> >> On my setup xz is a bottleneck during module installation. Here are the
> >> numbers to install it in a local directory, before and after this patch:
> >>
> >> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> >> Executed in  100.08 secs
> >>
> >> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> >> Executed in   28.60 secs
> >
> >
> > Heh, this is an interesting benchmark.
> >
> > Without this patch, you ran 16 processes of 'xz' in parallel
> > since you gave -j16.
> >
> > You created multiple threads in each xz process, and then got a 3x speedup.
> > What made it happen?
> >
> >
>
> During module installation in my setup, the build system would
> spend most of its time compressing big modules (such as the 350M
> amdgpu.ko) in a single thread, with 15 idle threads. Enabling
> multithreading allowed amdgpu to be compressed really fast.

It is a corner case, isn't it?
amdgpu.ko appears early in modules.order.
In most use-cases, other *.ko will fill the idle threads.


xz(1) says
  Setting threads to a special value 0 makes xz use up to as many threads
  as the processor(s) on the system support.


So, 'make -j$(nproc) modules_install'
will have (nproc * nproc) threads at maximum.
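
A rough sketch of that worst-case arithmetic, assuming the make job
count equals $(nproc) and that every xz process run with -T0 really
creates one thread per CPU (the variable names are only illustrative):

```shell
# Upper bound on xz threads for 'make -j$(nproc) modules_install'
# when each xz process is also run with -T0.
jobs=$(nproc)            # assumption: one make job per CPU
threads_per_xz=$(nproc)  # assumption: -T0 uses every CPU in each xz
max_threads=$((jobs * threads_per_xz))
echo "up to $max_threads xz threads for $jobs make jobs"
```

On a 16-thread machine this gives 256, and on a 24-thread machine 576.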

Of course, this is a theoretical calculation.
The actual number of spawned threads will be much less,
but spawning too many threads may not be nice.
For your case, Nathan's suggestion will do.




>
> The real performance improvement during module compression does not
> come from compressing as many small modules as possible in parallel,
> but from compressing the big ones with multiple threads, which proved
> to be the bottleneck in my setup.
>
> > How many threads can your system run?
>
> $ nproc
> 16
>
> >
> > I did not get such an improvement in my testing.
> > In my machine $(nproc) is 24.
> >
> >
> > [Without this patch]
> >
> > $ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)
> >
> > real 0m33.965s
> > user 10m6.118s
> > sys 0m37.231s
> >
> > [With this patch]
> >
> > $ time make INSTALL_MOD_PATH=/tmp/inst1  modules_install -j$(nproc)
> >
> > real 0m32.568s
> > user 10m4.472s
> > sys 0m39.132s
> >
> >
>
> I can see that my patch did not introduce performance regressions to
> your setup, at least.
>
> >
> > Given that GNU Make provides the parallel execution environment,
> > you can control the number of processes of 'xz'.
> >
> > There is no point in forcing multi-threading, which the user
> > did not ask or ever want.
> >
> >
>
> Should we drop -T0 from zstd then? It is currently forcing multi-threading.


I think yes.



--
Best Regards
Masahiro Yamada



