On Mon, Dec 03, 2018 at 03:13:37PM +0100, Ard Biesheuvel wrote:
> On Sun, 2 Dec 2018 at 11:47, Martin Willi <martin@xxxxxxxxxxxxxx> wrote:
> >
> > > To improve responsiveness, disable preemption for each step of the
> > > walk (which is at most PAGE_SIZE) rather than for the entire
> > > encryption/decryption operation.
> >
> > It seems that it is not that uncommon for IPsec to get small inputs
> > scattered over multiple blocks. Doing FPU context saving for each walk
> > step then can slow down things.
> >
> > An alternative approach could be to re-enable preemption not based on
> > the walk steps, but on the amount of bytes processed. This would
> > satisfy both users, I guess.
> >
> > In the long run we probably need a better approach for FPU context
> > saving, as this really hurts performance-wise. For IPsec we should find
> > a way to avoid the (multiple) per-packet FPU save/restores in softirq
> > context, but I guess this requires support from process context
> > switching.
> >
>
> At Jason's Zinc talk at plumbers, this came up, and apparently someone
> is working on this, i.e., to ensure that on x86, the FPU restore only
> occurs lazily, when returning to userland rather than every time you
> call kernel_fpu_end() [like we do on arm64 as well]
>
> Not sure what the ETA for that work is, though, nor did I get the name
> of the guy working on it.

Thanks for the suggestion; I'll replace this with a patch that re-enables
preemption every 4 KiB encrypted.  That also avoids having to do a
kernel_fpu_begin(), kernel_fpu_end() pair just for hchacha_block_ssse3().

But yes, I'd definitely like repeated kernel_fpu_begin(), kernel_fpu_end()
to not be incredibly slow.  That would help in a lot of other places too.

- Eric
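
For illustration only, here is a rough userspace sketch of the byte-count
idea discussed above: keep one FPU section open across several small
segments and close it only once about 4 KiB has been processed. This is not
the actual patch. fake_fpu_begin(), fake_fpu_end(), fake_simd_xor(),
struct segment, crypt_segments() and FPU_CHUNK_BYTES are all hypothetical
names made up for this sketch. The fake_fpu_* helpers merely stand in for
kernel_fpu_begin()/kernel_fpu_end(), and the XOR is a placeholder for the
real SIMD code, not a cipher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel_fpu_begin()/kernel_fpu_end().
 * They only track state so the number of sections can be counted. */
static unsigned int fpu_sections;	/* begin/end pairs started so far */
static int in_fpu_section;		/* nonzero while a section is open */

static void fake_fpu_begin(void)
{
	assert(!in_fpu_section);
	in_fpu_section = 1;
	fpu_sections++;
}

static void fake_fpu_end(void)
{
	assert(in_fpu_section);
	in_fpu_section = 0;
}

/* Placeholder for the SIMD routine: XOR with a constant, NOT a cipher. */
static void fake_simd_xor(uint8_t *buf, size_t len)
{
	assert(in_fpu_section);	/* SIMD work is only legal inside a section */
	for (size_t i = 0; i < len; i++)
		buf[i] ^= 0x5a;
}

/* Assumed budget: at most this many bytes per FPU section. */
#define FPU_CHUNK_BYTES 4096u

/* Hypothetical description of one scattered piece of the input. */
struct segment {
	uint8_t *data;
	size_t len;
};

/* Process all segments in place, ending the FPU section based on the
 * number of bytes handled rather than on segment boundaries. */
static void crypt_segments(struct segment *segs, size_t nsegs)
{
	size_t budget = 0;	/* bytes left in the current section */

	for (size_t i = 0; i < nsegs; i++) {
		uint8_t *p = segs[i].data;
		size_t left = segs[i].len;

		while (left > 0) {
			if (budget == 0) {
				fake_fpu_begin();
				budget = FPU_CHUNK_BYTES;
			}

			size_t todo = left < budget ? left : budget;

			fake_simd_xor(p, todo);
			p += todo;
			left -= todo;
			budget -= todo;

			/* Budget used up: let preemption happen here. */
			if (budget == 0)
				fake_fpu_end();
		}
	}

	/* Close a section left open by a partial final chunk. */
	if (in_fpu_section)
		fake_fpu_end();
}
```

With segments of 100, 100 and 5000 bytes, this sketch opens two sections in
total, whereas toggling once per walk step would need at least one per
segment. A real implementation would also have to respect the cipher's block
granularity and carry its state across chunks, which the sketch ignores.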