On 6/1/21 11:35 AM, Dave Hansen wrote: > On 6/1/21 11:14 AM, Andy Lutomirski wrote: >>> Nit: Could we move the detailed comments about TIF_NEED_FPU_LOAD >>> right next to the fpu__initialize() call? It would make it >>> painfully obvious which call is responsible. The naming isn't >>> super helpful here. >>> >> Hmm. I was actually thinking of open-coding it all. >> fpu__initialize() and fpu__drop() have very few callers, and I’m not >> even convinced that the other callers are calling them for a valid >> reason. > > Fine by me. We all know what the TIF flag does, but its connection to > this code is totally opaque. It's a place where removing the > abstraction might actually make sense. > After mulling this over for a while, I think the TIF flag's existence is fine, but the APIs are all awful. I think we should give the states in the little FPU state machine explicit names: ACTIVE: If the current CPU's FPU regs are ACTIVE, then those regs are the current task's registers and current->fpu does not necessarily contain the current task's registers. (This is !TIF_NEED_FPU_LOAD.) INACTIVE: If the current CPU's FPU regs are INACTIVE, then those regs belong to the kernel and do not belong to any particular task. The current task's FPU registers are in current->fpu. PRESERVED: If the current CPU's FPU regs are PRESERVED for a given task, then they are guaranteed to match that task's task->fpu. It is possible to tell whether a given task's FPU regs are PRESERVED on a given CPU, but it is not possible to tell whether a given CPU's regs are PRESERVED or INACTIVE, as it is possible that fpu_fpregs_owner_ctx points to memory that has been reused for a different purpose, so dereferencing ->last_fpu is not safe. If a non-current task is running, then its FPU state may not be accessed (obviously). Similarly, in interrupt context, neither FPU regs nor current->fpu may be accessed, as interrupts can nest inside fpregs_lock()ed regions and the state may be indeterminate. If a non-current task is *stopped*, then its FPU registers may be read. If its FPU registers are invalidated, then they may also be written. fpu__prepare_read() and fpu__prepare_write() may be used for these purposes. (But fpu__prepare_read() should be rewritten. It currently copies the regs to memory but leaves the state ACTIVE, which is just silly, especially the absurd way it manages the FSAVE clobber workaround. It should just do switch_fpu_prepare().) And we add some APIs that have locking assertions: fpu__current_regs_active(): returns true if the current task's regs are ACTIVE. This replaces most checks for TIF_NEED_FPU_LOAD. Asserts !preemptible(). and more to come, but not today.