Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

Eric Biggers <ebiggers@xxxxxxxxxx> · Thu, 5 Nov 2020 10:21:55 -0800

On Thu, Nov 05, 2020 at 05:05:53PM +0800, Li Qiang wrote:
> 
> 
> On 2020/11/5 15:51, Ard Biesheuvel wrote:
> > Note that NEON intrinsics can be compiled for 32-bit ARM as well (with
> > a bit of care - please refer to lib/raid6/recov_neon_inner.c for an
> > example of how to deal with intrinsics that are only available on
> > arm64) and are less error prone, so intrinsics should be preferred if
> > feasible.
> > 
> > However, you have still not explained how optimizing Adler32 makes a
> > difference for a real-world use case. Where is libdeflate used on a
> > hot path?
> 
> Sorry :(, I have not specifically searched for the use of this algorithm
> in the kernel.
> 
> When I used perf to test the performance of the libz library before,
> I saw that the adler32 algorithm accounted for a lot of the hot spots. I
> then saw this algorithm used in the kernel code, so I thought optimizing
> it might have some positive effect on the kernel. :)

Adler32 performance is important for zlib compression/decompression, which has a
few use cases in the kernel such as btrfs compression.  However, these days
those few kernel use cases are mostly switching to newer algorithms like lz4 and
zstd.  Also as I mentioned, your patch doesn't actually wire up your code to be
used by the kernel's implementation of zlib compression/decompression.
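
For readers unfamiliar with the checksum under discussion, here is a minimal
scalar sketch of Adler-32 as used by the zlib format.  It is not the code from
the patch and not the kernel's implementation; the name adler32_ref and the two
macros are hypothetical, chosen only for illustration.  A vectorized version
would be expected to return the same values as a reference like this one.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER32_MOD  65521u  /* largest prime below 2^16 */
#define ADLER32_NMAX 5552    /* bytes that can be summed before reducing */

/*
 * Hypothetical reference implementation (illustrative name).
 * Pass adler = 1 to start a new checksum, or a previous return
 * value to continue one over more data.
 */
uint32_t adler32_ref(uint32_t adler, const uint8_t *buf, size_t len)
{
	uint32_t s1 = adler & 0xffff;  /* running sum of bytes */
	uint32_t s2 = adler >> 16;     /* running sum of the s1 values */

	while (len > 0) {
		/* Process a chunk small enough that s2 cannot overflow 32 bits. */
		size_t n = len < ADLER32_NMAX ? len : ADLER32_NMAX;

		len -= n;
		while (n--) {
			s1 += *buf++;
			s2 += s1;
		}
		s1 %= ADLER32_MOD;
		s2 %= ADLER32_MOD;
	}
	return (s2 << 16) | s1;
}
```

The inner loop carries a dependency from each byte to the next, which is why
the checksum tends to show up in profiles and why SIMD versions restructure
the two sums rather than just unrolling the loop.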

I think you'd be much better off contributing to a userspace project, where
DEFLATE/zlib/gzip support still has a long tail of use cases.  The official zlib
isn't really being maintained and isn't accepting architecture-specific
optimizations, but there are some performance-oriented forks of zlib (e.g.
https://chromium.googlesource.com/chromium/src/third_party/zlib/ and
https://github.com/zlib-ng/zlib-ng), as well as other projects like libdeflate
(https://github.com/ebiggers/libdeflate).  Generally I'm happy to accept
architecture-specific optimizations in libdeflate, but they need to be testable.

- Eric