On Thu, 13 Feb 2025 at 05:17, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Feb 12, 2025 at 07:47:11AM -0800, Eric Biggers wrote:
> > [ This patchset keeps getting rejected by Herbert, who prefers a
> > complex, buggy, and slow alternative that shoehorns CPU-based hashing
> > into the asynchronous hash API which is designed for off-CPU offload:
> > https://lore.kernel.org/linux-crypto/cover.1730021644.git.herbert@xxxxxxxxxxxxxxxxxxx/
> > This patchset is a much better way to do it though, and I've already
> > been maintaining it downstream as it would not be reasonable to go the
> > asynchronous hash route instead. Let me know if there are any
> > objections to me taking this patchset through the fsverity tree, or at
> > least patches 1-5 as the dm-verity patches could go in separately. ]
>
> Yes I object. While I very much like this idea of parallel hashing
> that you're introducing, shoehorning it into shash is restricting
> this to storage-based users.
>
> Networking is equally able to benefit from parallel hashing, and
> parallel crypto (in particular, AEAD) in general. In fact, both
> TLS and IPsec can benefit directly from bulk submission instead
> of the current scheme where a single packet is processed at a time.
>
> But thanks for the reminder and I will be posting my patches
> soon.
>

I have to second Eric here, simply because his work has been ready to
go for a year now, while you keep rejecting it on the basis that
you're creating something better, and the only thing you have managed
to produce in the meantime didn't even work.

I strongly urge you to accept Eric's work. If your approach is really
superior, it should be fairly easy to make that point with working
code once you get around to producing it, and we can switch over the
users then.

The increased flexibility you claim your approach will have does not
mesh with my understanding of where the opportunities for improvement
are: CPU-based SHA can be tightly interleaved at the instruction level
for a performance gain of almost 2x. Designing a more flexible
ahash-based multibuffer API that can still take advantage of this to
the same extent is not straightforward, and you going off and cooking
up something by yourself for months at a time does not inspire
confidence that this will converge any time soon, if at all.

Also, your network use case is fairly theoretical, whereas the
fsverity and dm-verity code runs on hundreds of millions of mobile
phones in the field, so sacrificing any performance of the latter to
serve the former seems misguided to me.

So could you please remove yourself from the critical path here, and
merge this while we wait for your better alternative to materialize?

Thanks,
Ard.