On Tue, Jan 15, 2019 at 12:54 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>
> On 1/15/19 12:26 PM, Andy Lutomirski wrote:
> > I don't think we'd ever want kernel_fpu_end() to restore anything,
> > right?  I'm a bit confused as to when this optimization would actually
> > be useful.
>
> Using AVX-512 as an example...
>
> Let's say there was AVX-512 state, and a kernel_fpu_begin() user only
> used AVX2.  We could totally avoid doing *any* AVX-512 state save/restore.
>
> The init optimization doesn't help us if there _is_ AVX-512 state, and
> the modified optimization only helps if we recently did a XRSTOR at
> context switch and have not written to AVX-512 state since XRSTOR.
>
> This probably only matters for AVX-512-using apps that have run on a
> kernel with lots of kernel_fpu_begin()s that don't use AVX-512.  So, not
> a big deal right now.

On top of this series, this gets rather awkward, I think -- now we
need to be able to keep track of a state in which some of the user
registers live in the CPU and some live in memory, and we need to be
able to do the partial restore if we go back to user mode like this.
We also need to be able to do a partial save if we end up context
switching.  This seems rather complicated.

Last time I measured it (on Skylake IIRC), a full save was only about
twice as slow as a save that saved nothing at all, so I think we'd
need numbers.
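
To make the bookkeeping above concrete, here is a rough userspace sketch of the tracking a partial save/restore scheme might need. It is not kernel code and not taken from the series: struct fpu_track and the three helpers are hypothetical names invented for illustration. Only the XSAVE component bit positions are architectural. As a simplification, it ignores re-initializing components the kernel touched but the task was not using.

```c
#include <assert.h>
#include <stdint.h>

/* XSAVE state-component bits, by architectural bit position. */
#define XF_X87     (UINT64_C(1) << 0)
#define XF_SSE     (UINT64_C(1) << 1)
#define XF_AVX     (UINT64_C(1) << 2)
/* AVX-512 spans three components: opmask, ZMM_Hi256, Hi16_ZMM. */
#define XF_AVX512  (UINT64_C(7) << 5)

/* Hypothetical per-task bookkeeping; not an existing kernel structure. */
struct fpu_track {
	uint64_t user_mask;     /* components the task has state in */
	uint64_t live_in_regs;  /* subset whose user values are still in the CPU */
};

/*
 * kernel_fpu_begin() analogue: returns the mask of components that must
 * be saved to memory before the kernel clobbers them.
 */
static inline uint64_t partial_begin(struct fpu_track *t, uint64_t kernel_needs)
{
	uint64_t to_save = kernel_needs & t->live_in_regs;

	/* Invariant: only components the task uses can be live in registers. */
	assert((t->live_in_regs & ~t->user_mask) == 0);

	t->live_in_regs &= ~kernel_needs;  /* these now live in memory only */
	return to_save;
}

/* Return to user mode: restore whatever is not live in registers. */
static inline uint64_t restore_mask_for_user(struct fpu_track *t)
{
	uint64_t to_restore = t->user_mask & ~t->live_in_regs;

	t->live_in_regs = t->user_mask;
	return to_restore;
}

/* Context switch: save whatever is still only in registers. */
static inline uint64_t save_mask_for_switch(struct fpu_track *t)
{
	uint64_t to_save = t->live_in_regs;

	t->live_in_regs = 0;
	return to_save;
}
```

With user state in all four groups and a kernel user that needs only SSE and AVX, partial_begin() asks for just those two to be saved and the AVX-512 state stays in registers. Returning to user mode then restores only SSE and AVX, while a context switch at that point would have to save x87 and AVX-512. Even in this toy form there are three paths that all have to agree on one mask, which is the complexity in question.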