On Mon, Apr 20, 2020 at 2:32 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
> Maybe kernel_fpu_begin() should be passed the address of somewhere
> the address of an fpu save area buffer can be written to.
> Then the pre-emption code can allocate the buffer and save the
> state into it.
>
> However that doesn't solve the problem for non-preemptive kernels.
> They may need a cond_resched() in the loop if it might take 1ms (or so).
>
> kernel_fpu_begin() ought also be passed a parameter saying which
> fpu features are required, and return which are allocated.
> On x86 this could be used to check for AVX512 (etc) which may be
> available in an ISR unless it interrupted inside a kernel_fpu_begin()
> section (etc).
> It would also allow optimisations if only 1 or 2 fpu registers are
> needed (eg for some of the crypto functions) rather than the whole
> fpu register set.

There might be ways to improve lots of FPU things, indeed. This patch
is just a patch to Herbert's branch that makes uniform use of our
existing solution for this, fixing the existing bug. I wouldn't mind
seeing more involved and better solutions in a patchset for
crypto-next.

I will follow up on your suggestion in a different thread, so as not to
block this one.
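
For anyone skimming the thread, here is a rough userspace sketch of the general shape being discussed: bound the amount of work done inside each FPU section, and leave a rescheduling point between sections as David suggests for non-preemptive kernels. Everything in it is an assumption made for illustration. The fake_fpu_begin(), fake_fpu_end() and fake_cond_resched() stubs are hypothetical stand-ins for the kernel helpers, FAKE_CHUNK is an arbitrary bound, and the byte-summing loop merely stands in for a SIMD routine. It is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel helpers; they only
 * track state so the calling pattern can be checked. */
static int fpu_depth;      /* 1 while inside a simulated FPU section */
static int fpu_sections;   /* number of FPU sections opened so far */
static int resched_points; /* number of simulated cond_resched() calls */

static void fake_fpu_begin(void)
{
	assert(fpu_depth == 0); /* sections must not nest */
	fpu_depth = 1;
	fpu_sections++;
}

static void fake_fpu_end(void)
{
	assert(fpu_depth == 1);
	fpu_depth = 0;
}

static void fake_cond_resched(void)
{
	assert(fpu_depth == 0); /* never yield while FPU state is held */
	resched_points++;
}

/* Assumed upper bound on the work done per FPU section. */
#define FAKE_CHUNK 4096u

/* Sum a buffer in bounded chunks, releasing the simulated FPU between
 * chunks so that no single section runs for an unbounded time. */
static unsigned int sum_bytes_chunked(const unsigned char *buf, size_t len)
{
	unsigned int sum = 0;

	while (len) {
		size_t todo = len < FAKE_CHUNK ? len : FAKE_CHUNK;
		size_t i;

		fake_fpu_begin();
		for (i = 0; i < todo; i++)
			sum += buf[i];
		fake_fpu_end();

		/* Explicit yield point for the non-preemptive case. */
		fake_cond_resched();

		buf += todo;
		len -= todo;
	}
	return sum;
}
```

With a 10000-byte buffer this opens three sections (4096 + 4096 + 1808 bytes), and each yield point falls outside a section.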