On 08/04/2011 02:44 AM, Herbert Xu wrote:
> On Sun, Jul 24, 2011 at 07:53:14PM +0200, Mathias Krause wrote:
>> With this algorithm I was able to increase the throughput of a single
>> IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
>> the SSSE3 variant -- a speedup of +34.8%.
>
> Were you testing this on the transmit side or the receive side?
>
> As the IPsec receive code path usually runs in a softirq context,
> does this code have any effect there at all?
>
> This is pretty similar to the situation with the Intel AES code.
> Over there they solved it by using the asynchronous interface and
> deferring the processing to a work queue.

I have vague plans to clean up extended state handling and make
kernel_fpu_begin work efficiently from any context. (i.e. the first
kernel_fpu_begin after a context switch could take up to ~60 ns on Sandy
Bridge, but further calls to kernel_fpu_begin would be a single branch.)
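
To illustrate the cost model described above, here is a minimal user-space sketch in C. It is not kernel code and not the planned patch: struct cpu_model, model_context_switch, model_fpu_begin and model_demo are hypothetical names made up for this example, and a counter stands in for the expensive save of user extended state. It only shows the shape of the idea, assuming a per-CPU flag that a context switch clears: the first begin call after a switch pays for the save, and later calls return after one branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct cpu_model {
    bool user_state_saved;     /* true once user extended state was saved */
    unsigned long slow_saves;  /* number of expensive saves performed */
};

/* Models a context switch: user extended state is live in registers again. */
static void model_context_switch(struct cpu_model *cpu)
{
    cpu->user_state_saved = false;
}

/* Models the begin call: one branch on the fast path, one save otherwise. */
static void model_fpu_begin(struct cpu_model *cpu)
{
    if (cpu->user_state_saved)
        return;                /* fast path: state already saved */

    cpu->slow_saves++;         /* stands in for the expensive save */
    cpu->user_state_saved = true;
}

/* Walks through two context switches and returns the total slow saves. */
static unsigned long model_demo(void)
{
    struct cpu_model cpu = { false, 0 };

    model_context_switch(&cpu);
    model_fpu_begin(&cpu);     /* slow: first call after the switch */
    model_fpu_begin(&cpu);     /* fast */
    model_fpu_begin(&cpu);     /* fast */
    assert(cpu.slow_saves == 1);

    model_context_switch(&cpu);
    model_fpu_begin(&cpu);     /* slow again after the next switch */

    return cpu.slow_saves;
}
```

In this model four begin calls across two context switches trigger only two saves, so the cost scales with context switches rather than with the number of begin calls.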
The current code that handles context switches when user code is using
extended state is terrible and will almost certainly become faster in
the near future.

Hopefully I'll have patches for 3.2 or 3.3.

IOW, please don't introduce another thing like the fpu crypto module
quite yet unless there's a good reason. I'm looking forward to deleting
the fpu module entirely.

--Andy