On Wed, Sep 18, 2002 at 01:44:57AM +0200, Kevin D. Kissell wrote: > > I am now facing a couple of choices in the implementation and > > like to hear back from you. Those choices mainly differ at when we > > should save fpu context and when we should restore it. > > > > 1) always blindly save and restore during context switch (switch_to()) > > > > Not interesting. Just list it here for completeness. > > Not everything that is interesting is worth doing. > And not everything worth doing is interesting. > > > 2) save PFU context when process is switched off *only if* > > FPU is used in the last run. > > restore FPU context on next use of FPU. > > > > Need to use an additional flag to remember whether it is used > > in the current run. > > > > Perhaps overridding used_math? In that > > case, used_math == 2 indicates it used in the current run. > > used_math is set back to 1 when process is switched off. > > > > Very simply to implement. > > It's still somewhat less simple than the current hack, > and *that* was gotten wrong repeatedly. > It is much simpler than the current hack, because it does not maintain last_task_used_math or any "lazy switch" concepts. > > I'd much prefer something that is simple and processor-local, > even if it may be less optimal in some corner cases. For example, > Why not simply use CP0.Status.CU1 as a "dirty" bit? If it's set > when a process switches out, the FPU state gets saved, and CU1 > cleared. If it's not set when a process hits an FP instruction, > CU1 gets set and the context gets loaded. This involves no > access whatever to shared control variables, indeed, it doesn't > even go to memory to make the decision. It will, of course, save > some FP contexts that don't need saving, but it is well behaved > in the cases I care most about - it avoids saving/restoring FPRs > of code that is doing no FP whatsoever, and it ensures that > whenever a thread starts up, whatever CPU its on, its full > context is available to that CPU, no (coherent) questions asked. > This is basically 2) except for dirty bit difference. My current implementaion uses bit:1 in task->used_math flag for "dirty" bit purpose. I was thinking to use CU1, but it turns out to be a non- reliable indicator. Several places inside the kernel turning on/off FPUs. Perhaps after further cleanups, these offending places may become obsolete. I will keep this option in my mind. Jun