On Thu, 6 Jun 2024 at 10:08, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 06, 2024 at 09:55:56AM +0200, Ard Biesheuvel wrote:
> >
> > So again, how would that work for ahash falling back to shash. Are you
> > saying every existing shash implementation should be duplicated into
> > an ahash so that the multibuffer optimization can be added? shash is a
> > public interface so we cannot just remove the existing ones and we'll
> > end up carrying both forever.
>
> It should do the same thing for ahash algorithms that do not support
> multiple requests. IOW it should process the requests one by one.
>

That is not what I am asking. Are you suggesting that, e.g., the arm64
sha2 shash implementation that is modified by this series should
instead expose both an shash as before, and an ahash built around the
same asm code that exposes the multibuffer capability?

> > Sure, but the block I/O world is very different. Forcing it to use an
> > API modeled after how IPsec might use it seems, again, unreasonable.
>
> It's not different at all. You can see that by the proliferation
> of kmap calls in fs/verity. It's a fundamental issue. You can't
> consistently get a large contiguous allocation beyond one page due
> to fragmentation. So large data is always going to be scattered.
>

I don't think this is true for many uses of the block layer.

> BTW, I'm all for eliminating the overhead when you already have a
> linear address for scattered memory, e.g., through vmalloc. We
> should definitely improve our interface for ahash/skcipher/aead so
> that vmalloc addresses (as well as kmalloc virtual addresses by
> extension) are supported as first class citizens, and we don't turn
> them into SG lists unless it's necessary for DMA.
>

Yes, this is something I've been pondering for a while.

An ahash/skcipher/aead with CRYPTO_ALG_ASYNC cleared (which would guarantee that any provided VA would not be referenced after the algo invocation returns) should be able to consume a request that carries virtual addresses rather than SG lists. Given that it is up to the caller to choose between sync and async, it would be in a good position also to judge whether it wants to use stack/vmalloc addresses. I'll have a stab at this.
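
A rough userspace model of the idea above, purely for illustration. Every name in it (demo_req, demo_alg, demo_digest and so on) is hypothetical and none of it is the kernel crypto API; the is_async field merely models CRYPTO_ALG_ASYNC being set, and a toy FNV-1a loop stands in for a real digest. The point it tries to show is that a request could carry either an SG list or a plain virtual address, and that the VA form would only be accepted by a synchronous algorithm, since only then is the buffer guaranteed not to be referenced after the call returns.

```c
/* Userspace model only: none of these names exist in the kernel crypto API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct demo_sg {
	const uint8_t *buf;
	size_t len;
};

enum demo_src_kind { DEMO_SRC_SG, DEMO_SRC_VIRT };

/* Hypothetical request: the source is either an SG list or a plain VA. */
struct demo_req {
	enum demo_src_kind kind;
	union {
		struct { const struct demo_sg *sg; size_t nents; } sgl;
		struct { const uint8_t *addr; size_t len; } virt;
	} src;
};

/* Hypothetical algorithm descriptor; is_async models CRYPTO_ALG_ASYNC. */
struct demo_alg {
	bool is_async;
};

/* Toy FNV-1a update, standing in for a real digest update step. */
static uint32_t demo_update(uint32_t state, const uint8_t *p, size_t len)
{
	for (size_t i = 0; i < len; i++)
		state = (state ^ p[i]) * 16777619u;
	return state;
}

/*
 * Returns 0 on success, or -1 if a VA-based request is handed to an
 * async algorithm, which might still touch the buffer after returning.
 */
static int demo_digest(const struct demo_alg *alg,
		       const struct demo_req *req, uint32_t *out)
{
	uint32_t state = 2166136261u;	/* FNV-1a offset basis */

	assert(alg && req && out);

	if (req->kind == DEMO_SRC_VIRT) {
		if (alg->is_async)
			return -1;
		/* Linear buffer: consume it directly, no SG list needed. */
		state = demo_update(state, req->src.virt.addr,
				    req->src.virt.len);
	} else {
		/* Scattered buffer: walk the entries in order. */
		for (size_t i = 0; i < req->src.sgl.nents; i++)
			state = demo_update(state, req->src.sgl.sg[i].buf,
					    req->src.sgl.sg[i].len);
	}

	*out = state;
	return 0;
}
```

In this model the caller, who already chooses between sync and async, is also the one who decides whether to pass a stack/vmalloc-style address, and the same data hashed via either source form yields the same result.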