Re: [PATCH] dm-verity: hash blocks with shash import+finup when possible

Hi Eric,

On Sun, Oct 29, 2023 at 7:34 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> From: Eric Biggers <ebiggers@xxxxxxxxxx>
>
> Commit d1ac3ff008fb ("dm verity: switch to using asynchronous hash
> crypto API"), from Linux v4.12, made dm-verity do its hashing using the
> ahash API instead of the shash API.  While this added support for
> hardware (off-CPU) hashing offload, it slightly hurt performance for
> everyone else due to additional crypto API overhead.  This API overhead
> is becoming increasingly significant as I/O speeds increase and CPUs
> achieve increasingly high SHA-2 speeds using native SHA-2 instructions.
>
> Recent crypto API patches
> (https://lore.kernel.org/linux-crypto/20231022081100.123613-1-ebiggers@xxxxxxxxxx)
> are reducing that overhead.  However, it cannot be eliminated.
>
> Meanwhile, another crypto API related sub-optimality of how dm-verity
> currently implements block hashing is that it always computes each hash
> using multiple calls to the crypto API.  The most common case is:
>
>     1. crypto_ahash_init()
>     2. crypto_ahash_update() [salt]
>     3. crypto_ahash_update() [data]
>     4. crypto_ahash_final()
>
> With less common dm-verity settings, the update of the salt can happen
> after the data, or the data can require multiple updates.
>
> Regardless, each call adds some API overhead.  Again, that's being
> reduced by recent crypto API patches, but it cannot be eliminated; each
> init, update, or final step necessarily involves an indirect call to the
> actual "algorithm", which is expensive on modern CPUs, especially when
> mitigations for speculative execution vulnerabilities are enabled.
>
> A significantly more optimal sequence for the common case is to do an
> import (crypto_ahash_import()), then a finup (crypto_ahash_finup()).
> This results in as few as one indirect call, the one for finup.
>
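
For readers following along, here is a minimal userspace sketch of the difference between the two call sequences. It is not the kernel code: the toy FNV-1a hash stands in for SHA-256, and every name in it (toy_hash_state, toy_finup(), hash_block_import_finup(), and so on) is hypothetical. "Import" is modelled as a plain copy of a state that already has the salt absorbed, which is only an assumption about how the idea works in principle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy 64-bit FNV-1a state; a stand-in for a real hash state such as SHA-256. */
struct toy_hash_state {
	uint64_t h;
};

static void toy_init(struct toy_hash_state *st)
{
	st->h = 0xcbf29ce484222325ULL;		/* FNV-1a 64-bit offset basis */
}

static void toy_update(struct toy_hash_state *st, const uint8_t *data,
		       size_t len)
{
	for (size_t i = 0; i < len; i++) {
		st->h ^= data[i];
		st->h *= 0x100000001b3ULL;	/* FNV-1a 64-bit prime */
	}
}

static uint64_t toy_final(const struct toy_hash_state *st)
{
	return st->h;
}

/* finup: the last update and the final step folded into a single call. */
static uint64_t toy_finup(struct toy_hash_state *st, const uint8_t *data,
			  size_t len)
{
	toy_update(st, data, len);
	return toy_final(st);
}

/* General sequence: init, update(salt), update(data), final on every block. */
static uint64_t hash_block_general(const uint8_t *salt, size_t salt_len,
				   const uint8_t *data, size_t data_len)
{
	struct toy_hash_state st;

	toy_init(&st);
	toy_update(&st, salt, salt_len);
	toy_update(&st, data, data_len);
	return toy_final(&st);
}

/* Done once at setup time: absorb the salt and keep the resulting state. */
static void precompute_salted_state(struct toy_hash_state *salted,
				    const uint8_t *salt, size_t salt_len)
{
	toy_init(salted);
	toy_update(salted, salt, salt_len);
}

/* Per-block fast sequence: import the saved state, then a single finup. */
static uint64_t hash_block_import_finup(const struct toy_hash_state *salted,
					const uint8_t *data, size_t data_len)
{
	struct toy_hash_state st;

	assert(salted != NULL);
	memcpy(&st, salted, sizeof(st));	/* the "import" step */
	return toy_finup(&st, data, data_len);
}
```

Both helpers return the same value for the same salt and data. The second one only works when the salt comes first, because the saved state has to be reusable for every block.
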
> Implementing the shash and import+finup optimizations independently
> would result in 4 code paths, which seems a bit excessive.  This patch
> therefore takes a slightly simpler approach.  It implements both
> optimizations, but only together.  So, dm-verity now chooses either the
> existing, fully general ahash method; or it chooses the new shash
> import+finup method which is optimized for what most dm-verity users
> want: CPU-based hashing with the most common dm-verity settings.
>
> The new method is used automatically when appropriate, i.e. when the
> ahash and shash APIs resolve to the same underlying algorithm, the
> dm-verity version is not 0 (so that the salt is hashed before the data),
> and the data block size is not greater than the page size.
>
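
The three conditions above can be read as a single predicate. The sketch below only restates the paragraph. The struct, its fields, and the function name are hypothetical and are not taken from the patch, and how "same underlying algorithm" is actually determined is left as a boolean input here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the settings the paragraph mentions. */
struct verity_params {
	unsigned int version;		/* dm-verity format version */
	size_t data_block_size;		/* data block size in bytes */
	bool same_algorithm;		/* ahash and shash resolve to the same algorithm */
};

/* True only when all three conditions for the shash import+finup method hold. */
static bool can_use_import_finup(const struct verity_params *p,
				 size_t page_size)
{
	return p->same_algorithm &&
	       p->version != 0 &&		/* salt is hashed before the data */
	       p->data_block_size <= page_size;
}
```

If any condition fails, the existing fully general ahash method is the fallback.
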
> In benchmarks with veritysetup's default parameters (SHA-256, 4K data
> and hash block sizes, 32-byte salt), which also match the parameters
> that Android currently uses, this patch improves block hashing
> performance by about 15% on an x86_64 system that supports the SHA-NI
> instructions, or by about 5% on an arm64 system that supports the ARMv8
> SHA2 instructions.  This was with CONFIG_CRYPTO_STATS disabled; an even
> larger improvement can be expected if that option is enabled.

That's an impressive performance improvement. Thanks for the patch!

Reviewed-by: Sami Tolvanen <samitolvanen@xxxxxxxxxx>

Sami



