Re: [RFC] FPU context switch

Jun Sun <jsun@mvista.com> · Tue, 17 Sep 2002 16:45:20 -0700

On Wed, Sep 18, 2002 at 01:44:57AM +0200, Kevin D. Kissell wrote:
> > I am now facing a couple of choices in the implementation and 
> > would like to hear back from you.  The choices differ mainly in when
> > we should save the FPU context and when we should restore it.
> > 
> > 1) always blindly save and restore during context switch (switch_to())
> > 
> > Not interesting.  Just list it here for completeness.
> 
> Not everything that is interesting is worth doing.
> And not everything worth doing is interesting.
> 
> > 2) save FPU context when the process is switched off *only if* 
> >    the FPU was used in the last run.  
> >    restore FPU context on the next use of the FPU.
> > 
> > Need to use an additional flag to remember whether it is used
> > in the current run.
> >
> > Perhaps overriding used_math?  In that
> > case, used_math == 2 indicates the FPU was used in the current run.
> > used_math is set back to 1 when the process is switched off.
> > 
> > Very simple to implement.
> 
> It's still somewhat less simple than the current hack,
> and *that* was gotten wrong repeatedly.
>

It is much simpler than the current hack, because it does not
maintain last_task_used_math or any "lazy switch" concepts.

> 
> I'd much prefer something that is simple and processor-local,
> even if it may be less optimal in some corner cases.  For example,
> why not simply use CP0.Status.CU1 as a "dirty" bit?  If it's set 
> when a process switches out, the FPU state gets saved, and CU1 
> cleared.  If it's not set when a process hits an FP instruction, 
> CU1 gets set and the context gets loaded. This involves no 
> access whatever to shared control variables, indeed, it doesn't 
> even go to memory to make the decision. It will, of course, save 
> some FP contexts that don't need saving, but it is well behaved
> in the cases I care most about - it avoids saving/restoring FPRs
> of code that is doing no FP whatsoever, and it ensures that
> whenever a thread starts up, whatever CPU it's on, its full
> context is available to that CPU, no (coherent) questions asked.
> 

This is basically 2), except for which bit serves as the dirty bit.

My current implementation uses bit 1 of the task->used_math flag as
the "dirty" bit.
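
For illustration only, here is a user-space sketch of how such a dirty
bit could work.  It is not the actual patch: the macro and function
names, the exact bit layout (bit 0 = "has used the FPU", bit 1 = "used
it in the current run"), and the hw_fpr/fpu_enabled stand-ins for the
hardware registers and CU1 are all assumptions made up for this example.

```c
/* User-space model of the scheme; NOT kernel code. */
#define NUM_FPR    32
#define USED_MATH  0x1  /* assumed: task has used the FPU at some point */
#define FPU_DIRTY  0x2  /* assumed: task used the FPU in the current run */

struct task {
    unsigned int used_math;
    double saved_fpr[NUM_FPR];   /* per-task saved FPU context */
};

static double hw_fpr[NUM_FPR];   /* stands in for the hardware FP registers */
static int fpu_enabled;          /* stands in for CP0.Status.CU1 */

/* Models the "coprocessor unusable" trap taken on the first FP
 * instruction of a run: load the context and mark the task dirty. */
static void fpu_first_use(struct task *t)
{
    int i;

    if (t->used_math & USED_MATH) {
        for (i = 0; i < NUM_FPR; i++)
            hw_fpr[i] = t->saved_fpr[i];   /* restore saved context */
    } else {
        for (i = 0; i < NUM_FPR; i++)
            hw_fpr[i] = 0.0;               /* fresh context */
        t->used_math |= USED_MATH;
    }
    t->used_math |= FPU_DIRTY;
    fpu_enabled = 1;
}

/* Models the switch_to() side: save only if the FPU was used in
 * the run that is ending, then clear the dirty bit. */
static void fpu_switch_off(struct task *t)
{
    int i;

    if (t->used_math & FPU_DIRTY) {
        for (i = 0; i < NUM_FPR; i++)
            t->saved_fpr[i] = hw_fpr[i];
        t->used_math &= ~FPU_DIRTY;
    }
    fpu_enabled = 0;   /* next FP instruction traps again */
}
```

In this model the save/restore decision reads only the task's own flag,
so a task that does no FP in a run costs nothing at switch time.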

I was thinking of using CU1, but it turns out to be an unreliable
indicator: several places inside the kernel turn the FPU on and off
on their own.

Perhaps after further cleanups these offending places will become
obsolete.  I will keep this option in mind.

Jun