On 14 Oct 2009, at 5:26 AM, Peter Teoh wrote:
On Thu, Oct 8, 2009 at 5:10 PM, Jason Nymble
<jason.nymble@xxxxxxxxx> wrote:
Can one safely use SSE2 instructions in kernel module code? Or are
those 128-bit registers not preserved across a kernel/userspace
context switch?
There should be no problem: the kernel uses the "fxsave" assembly
instruction to back up all the FPU/SSE/MMX registers during a context
switch. The call chain is
__switch_to() --> __unlazy_fpu() --> __save_init_fpu() --> fxsave()
(function):
static inline void fxsave(struct task_struct *tsk)
{
	/* Using "rex64; fxsave %0" is broken because, if the memory operand
	   uses any extended registers for addressing, a second REX prefix
	   will be generated (to the assembler, rex64 followed by semicolon
	   is a separate instruction), and hence the 64-bitness is lost. */
#if 0
	/* Using "fxsaveq %0" would be the ideal choice, but is only supported
	   starting with gas 2.16. */
	__asm__ __volatile__("fxsaveq %0"
			     : "=m" (tsk->thread.xstate->fxsave));
#elif 0
	/* Using, as a workaround, the properly prefixed form below isn't
	   accepted by any binutils version so far released, complaining that
	   the same type of prefix is used twice if an extended register is
	   needed for addressing (fix submitted to mainline 2005-11-21). */
	__asm__ __volatile__("rex64/fxsave %0"
			     : "=m" (tsk->thread.xstate->fxsave));
#else
	/* This, however, we can work around by forcing the compiler to select
	   an addressing mode that doesn't require extended registers. */
	__asm__ __volatile__("rex64/fxsave (%1)"
			     : "=m" (tsk->thread.xstate->fxsave)
			     : "cdaSDb" (&tsk->thread.xstate->fxsave));
#endif
}

static inline void __save_init_fpu(struct task_struct *tsk)
{
	if (task_thread_info(tsk)->status & TS_XSAVE)
		xsave(tsk);
	else
		fxsave(tsk);

	clear_fpu_state(tsk);
	task_thread_info(tsk)->status &= ~TS_USEDFPU;
}

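For anyone who wants to see what the instruction itself stores, here is
a small user-space sketch. It is not kernel code and not how the kernel
invokes fxsave; it only shows that the 512-byte image written by the
instruction contains the XMM registers. The names fxsave_area,
snapshot_with_xmm0 and saved_mxcsr are made up for illustration. The
offsets 24 (MXCSR) and 160 (XMM0) are taken from the FXSAVE memory
layout in the Intel manuals. It assumes GCC or Clang on x86 with SSE2
enabled, and returns -1 on any other target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* FXSAVE writes a 512-byte image and needs a 16-byte aligned buffer. */
struct fxsave_area {
	uint8_t bytes[512];
} __attribute__((aligned(16)));

/* Byte offsets inside the FXSAVE image (per the Intel SDM layout). */
enum { FX_MXCSR_OFFSET = 24, FX_XMM0_OFFSET = 160 };

/*
 * Hypothetical helper: load 16 bytes into XMM0, then dump the FPU/SSE
 * state with FXSAVE. Returns 0 on success, -1 if built without SSE2.
 */
static int snapshot_with_xmm0(struct fxsave_area *area,
			      const uint8_t pattern[16])
{
#ifdef __SSE2__
	/* FXSAVE faults on a misaligned operand, so check first. */
	assert(((uintptr_t)area & 15) == 0);
	__asm__ __volatile__(
		"movdqu %1, %%xmm0\n\t"	/* put a known value in XMM0 */
		"fxsave %0"		/* write the state image     */
		: "=m" (*area)
		: "m" (*(const uint8_t (*)[16])pattern)
		: "xmm0");
	return 0;
#else
	(void)area;
	(void)pattern;
	return -1;
#endif
}

/* Hypothetical helper: read the saved MXCSR value out of the image. */
static uint32_t saved_mxcsr(const struct fxsave_area *area)
{
	uint32_t v;

	memcpy(&v, area->bytes + FX_MXCSR_OFFSET, sizeof(v));
	return v;
}
```

A caller would declare a struct fxsave_area on the stack, pass any
16-byte pattern, and compare the bytes at FX_XMM0_OFFSET with that
pattern. The kernel code above does the same kind of dump, but into
tsk->thread.xstate rather than a local buffer.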
Interesting, thank you. How about protecting SSE2 code from preemptive
scheduling within the kernel itself? Should one disable preemption
and/or interrupts around code which uses SSE2 instructions in the
kernel?