On Tue, Oct 17, 2017 at 5:06 PM, Russell King - ARM Linux <linux@xxxxxxxxxxxxxxx> wrote:
> On Sun, Oct 15, 2017 at 10:19:45AM +0100, Gilad Ben-Yossef wrote:
>> Many users of kernel async. crypto services have a pattern of
>> starting an async. crypto op and then using a completion
>> to wait for it to end.
>>
>> This patch set simplifies this common use case in two ways:
>>
>> First, by separating the return codes of the case where a
>> request is queued to a backlog due to the provider being
>> busy (-EBUSY) from the case where the request has failed due
>> to the provider being busy and backlogging is not enabled
>> (-EAGAIN).
>>
>> Next, this change is then built upon to create a generic API
>> to wait for an async. crypto operation to complete.
>>
>> The end result is a smaller code base and an API that is
>> easier to use and more difficult to get wrong.
>>
>> The patch set was boot tested on x86_64 and arm64, which
>> at the very least tests the crypto users via testmgr and
>> tcrypt, but I do note that I do not have access to some
>> of the HW whose drivers are modified, nor do I claim I was
>> able to test all of the corner cases.
>>
>> The patch set is based upon the linux-next release tagged
>> next-20171013.
>
> Has there been any performance impact analysis of these changes? I
> ended up with patches for one of the crypto drivers which converted
> its interrupt handling to threaded interrupts being reverted because
> it caused a performance degradation.
>
> Moving code to the latest APIs to simplify it is not always beneficial.

I agree with the sentiment, but I believe this one is justified.

This patch set basically does 3 things:

1. Replace one immediate value (-EBUSY) with another (-EAGAIN).
Mostly it's just s/EBUSY/EAGAIN/g. In very few places this resulted
in very trivial code changes. I don't foresee this having any effect
on performance.

2. Remove some conditions and/or conditional jumps that were used to
discern between two different cases, which are now easily told apart
by the different return values. If anything, this will be a slight
increase in performance, although I don't expect it to be noticeable.

3. Replace a whole bunch of open-coded code and data structures,
which were pretty much cut and pasted from the Documentation and
therefore identical, with a single shared copy. In every place I
found that deviated slightly from the common pattern, the deviation
turned out to be a bug of some sort, and patches for those were
already sent and accepted. So we might be losing a few inlining
opportunities, but we're gaining better cache utilization. Again, I
don't expect any of this to have a noticeable effect in either
direction.

I did run the changed code as best I could and did not notice any
performance changes, and none of the testers and maintainers who
ACKed the patches mentioned any.

Having said that, it is a big change that touches many places,
sub-systems and drivers. I do not claim to have personally tested
all of the changes for performance; in some cases I don't even have
access to the specialized hardware. I did get a reasonable amount of
review and testing, I believe - and would always love to see more :-)

Many thanks,
Gilad

--
Gilad Ben-Yossef
Chief Coffee Drinker

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru