Leonid Yegoshin <Leonid.Yegoshin@xxxxxxxxxx> writes:
> > +	/* Prevent any threads from obtaining live FP context */
> > +	atomic_set(&task->mm->context.fp_mode_switching, 1);
> > +	smp_mb__after_atomic();
> > +
> > +	/*
> > +	 * If there are multiple online CPUs then wait until all threads whose
> > +	 * FP mode is about to change have been context switched. This approach
> > +	 * allows us to only worry about whether an FP mode switch is in
> > +	 * progress when FP is first used in a tasks time slice. Pretty much all
> > +	 * of the mode switch overhead can thus be confined to cases where mode
> > +	 * switches are actually occuring. That is, to here. However for the
> > +	 * thread performing the mode switch it may take a while...
> > +	 */
> > +	if (num_online_cpus() > 1) {
> > +		spin_lock_irq(&task->sighand->siglock);
> > +
> > +		for_each_thread(task, t) {
> > +			if (t == current)
> > +				continue;
> > +
> > +			switch_count = t->nvcsw + t->nivcsw;
> > +
> > +			do {
> > +				spin_unlock_irq(&task->sighand->siglock);
> > +				cond_resched();
> > +				spin_lock_irq(&task->sighand->siglock);
> > +			} while ((t->nvcsw + t->nivcsw) == switch_count);
> > +		}
> > +
> > +		spin_unlock_irq(&task->sighand->siglock);
> > +	}
> >
>
> This piece of thread walking seems to be not thread safe for newly
> created thread.
> Thread creation is not locked between points of copy_thread which copies
> task thread flags and makeing thread visible to walking via
> "for_each_thread".
>
> So it is possible in environment with two threads - one is creating an
> another thread, another one switching FPU mode and waiting and race
> condition may causes a newly thread in old mode but the rest of thread
> group is in new mode.
>
> Besides that, it looks like in kernel with tickless mode a scheduler may
> no come a long time in idle system, in extreme case - forever.

Only commenting on the tickless issue...
The requirement for the PR_SET_FP_MODE call is that all threads in the
current thread group switch to the new mode before it returns. I believe
that simply means the tickless case has no alternative but to wait as
long as it has to wait?

If the prctl failed in tickless mode (or timed out), that would likely
lead to a program failing to load its libraries and aborting. So if the
only other alternative is for the prctl to fail, then I'm not sure that
is any better than waiting forever.

For the vast majority of cases the prctl calls to change mode will happen
very early in the user process, while it is still single-threaded. These
will be part of loading an application's initial set of shared libraries.
Perhaps that means this corner case of a long delay is not overly
dangerous anyway?

Thanks,
Matthew