On Mon, 27 May 2019 at 11:44, Pascal Van Leeuwen
<pvanleeuwen@xxxxxxxxxxxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Ard Biesheuvel [mailto:ard.biesheuvel@xxxxxxxxxx]
> > Sent: Friday, May 24, 2019 11:45 AM
> > To: Pascal Van Leeuwen <pvanleeuwen@xxxxxxxxxxxxxxxx>
> > Cc: Christophe Leroy <christophe.leroy@xxxxxx>; linux-crypto@xxxxxxxxxxxxxxx
> > Subject: Re: another testmgr question
> >
> > On Fri, 24 May 2019 at 11:34, Pascal Van Leeuwen
> > <pvanleeuwen@xxxxxxxxxxxxxxxx> wrote:
> > >
> > > > All userland clients of the in-kernel crypto use it specifically to
> > > > access h/w accelerators, given that software crypto doesn't require
> > > > the higher privilege level (no point in issuing those AES CPU
> > > > instructions from the kernel if you can issue them in your program
> > > > directly)
> > > >
> > > > Basically, what is used is a socket interface that can block on
> > > > read()/write(). So the userspace program doesn't need to be aware of
> > > > the asynchronous nature, it is just frozen while the calls are being
> > > > handled by the hardware.
> > > >
> > > With all due respect, but if the userland application is indeed
> > > *frozen* while the calls are being handled, then that seems like its
> > > pretty useless - for symmetric crypto, anyway - as performance would be
> > > dominated by latency, not throughput.
> > > Hardware acceleration would almost always lose that compared to a local
> > > software implementation.
> > > I certainly wouldn't want such an operation to end up at my driver!
> > >
> >
> > Again, you are making assumptions here that don't always hold. Note that
> > - a frozen process frees up the CPU to do other things while the
> >   crypto is in progress;
> > - h/w crypto is typically more power efficient than CPU crypto;
> > - several userland programs and in-kernel users may be active at the
> >   same time, so the fact that a single user sleeps doesn't mean the
> >   hardware is used inefficiently
> >
> With all due respect, but you are making assumptions as well. You are
> making the assumption that reducing CPU load and/or reducing power
> consumption is *more* important than absolute application performance or
> latency. Which is certainly not always the case.
>

I never said power consumption is *always* more important. You were
assuming it never is.

> In addition to the assumption that using the hardware will actually
> *achieve* this, while that really depends on the ratio of driver overhead
> (which can be quite significant, unfortunately, especially if the API was
> not really created from the get-go with HW in mind) vs hardware processing
> time.
>

Of course.

> In many cases where only small amounts of data are processed sequentially,
> the hardware will simply lose on all accounts ... So Linus actually did
> have a point there. Hardware only wins for specific use cases. It's
> important to realize that and not try and use hardware for everything.
>

True. But we have already painted ourselves into a corner here, since
whatever we expose to userland today cannot simply be revoked.

I guess you could argue that your particular driver should not be
exposed to userland, and I think we may even have a CRYPTO_ALG_xxx
flag for that. But even if that does happen, it doesn't mean you can
stop caring about zero length inputs :-)
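
The trade-off argued over above (fixed driver overhead versus hardware
processing time, for requests issued one at a time) can be made concrete
with a rough cost model. This is only a sketch: the Engine class, the
break_even_bytes() helper and every number below are hypothetical and
chosen for illustration, not measurements of any real driver or CPU. It
models the latency of a single blocking request and deliberately ignores
the other points raised in the thread (CPU time freed while the caller
sleeps, power efficiency, several users sharing the hardware).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """Per-request cost model for one crypto engine (illustrative only)."""
    overhead_us: float      # fixed cost per request: syscall, driver, DMA setup
    throughput_mb_s: float  # sustained processing rate in MB/s (10^6 bytes/s)

    def latency_us(self, nbytes: int) -> float:
        # 1 MB/s equals 1 byte per microsecond, so bytes / (MB/s) is in us.
        return self.overhead_us + nbytes / self.throughput_mb_s


def break_even_bytes(sw: Engine, hw: Engine) -> float:
    """Request size above which hw has lower single-request latency than sw."""
    if hw.throughput_mb_s <= sw.throughput_mb_s:
        return float("inf")  # hardware never catches up on latency
    extra_overhead = hw.overhead_us - sw.overhead_us
    per_byte_gain = 1 / sw.throughput_mb_s - 1 / hw.throughput_mb_s
    return max(0.0, extra_overhead / per_byte_gain)


# Assumed figures: software with negligible setup cost, hardware that is
# 4x faster per byte but pays 50 us of fixed overhead on every request.
software = Engine(overhead_us=1.0, throughput_mb_s=1000.0)
hardware = Engine(overhead_us=50.0, throughput_mb_s=4000.0)

if __name__ == "__main__":
    for size in (64, 4096, 1 << 20):
        print(size, software.latency_us(size), hardware.latency_us(size))
    print("break-even size in bytes:", break_even_bytes(software, hardware))
```

With these assumed figures the hardware path only wins on latency for
requests of several tens of kilobytes and up, which matches the point
that small sequential requests favour software. Different overhead or
throughput values move the break-even point accordingly.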