On Wed, Oct 14, 2009 at 6:49 AM, Jason Nymble <jason.nymble@xxxxxxxxx> wrote:
>
> On 14 Oct 2009, at 5:26 AM, Peter Teoh wrote:
>
>> On Thu, Oct 8, 2009 at 5:10 PM, Jason Nymble <jason.nymble@xxxxxxxxx>
>> wrote:
>>>
>>> Can one safely use SSE2 instructions in kernel module code? Or are those
>>> 128bit registers not preserved across kernel/userspace context switch?
>>>
>>
>> should be no problem...the kernel uses "fxsave" (assembly instruction)
>> to back up all the FPU/SSE/MMX registers during a context switch....
>>
>> from __switch_to() --> __unlazy_fpu() --> __save_init_fpu() --> fxsave()
>> (function):
>>
>> static inline void fxsave(struct task_struct *tsk)
>> {
>>         /* Using "rex64; fxsave %0" is broken because, if the memory operand
>>            uses any extended registers for addressing, a second REX prefix
>>            will be generated (to the assembler, rex64 followed by semicolon
>>            is a separate instruction), and hence the 64-bitness is lost. */
>> #if 0
>>         /* Using "fxsaveq %0" would be the ideal choice, but is only supported
>>            starting with gas 2.16. */
>>         __asm__ __volatile__("fxsaveq %0"
>>                              : "=m" (tsk->thread.xstate->fxsave));
>> #elif 0
>>         /* Using, as a workaround, the properly prefixed form below isn't
>>            accepted by any binutils version so far released, complaining that
>>            the same type of prefix is used twice if an extended register is
>>            needed for addressing (fix submitted to mainline 2005-11-21). */
>>         __asm__ __volatile__("rex64/fxsave %0"
>>                              : "=m" (tsk->thread.xstate->fxsave));
>> #else
>>         /* This, however, we can work around by forcing the compiler to select
>>            an addressing mode that doesn't require extended registers.
>>            */
>>         __asm__ __volatile__("rex64/fxsave (%1)"
>>                              : "=m" (tsk->thread.xstate->fxsave)
>>                              : "cdaSDb" (&tsk->thread.xstate->fxsave));
>> #endif
>> }
>>
>> static inline void __save_init_fpu(struct task_struct *tsk)
>> {
>>         if (task_thread_info(tsk)->status & TS_XSAVE)
>>                 xsave(tsk);
>>         else
>>                 fxsave(tsk);
>>
>>         clear_fpu_state(tsk);
>>         task_thread_info(tsk)->status &= ~TS_USEDFPU;
>> }
>>
>
> Interesting, thank you. How about protecting SSE2 code from preemptive
> scheduling within the kernel itself? Should one disable preemption and/or
> interrupts around code which performs SSE2 in the kernel?
>

Across a context switch they are treated the same way as any other
registers (eax/rax/ebx/rbx etc.), so the state belonging to a userspace
task is preserved. Code that uses SSE2 inside the kernel is a different
matter: the kernel does not save the FPU/SSE state on entry to kernel
mode, so such code should be bracketed with kernel_fpu_begin() and
kernel_fpu_end(). kernel_fpu_begin() disables preemption and saves the
current task's FPU/SSE state if it is live, so the code in between must
not sleep.

(For certain registers, definitely not the SSE ones, there is a different
set of problems to worry about when writing inline assembly: ABI
violations.)

http://www.x86-64.org/documentation/abi.pdf

This is the ABI which gcc follows when generating instructions, and so
the Linux kernel and applications all follow this convention. (For
example, if you read the kernel in assembly, every function carries
implicit knowledge of which registers it may and may not use, and which
ones hold the input arguments, following the ABI conventions.)

There is no problem if you program in C. But when you write inline
assembly, gcc does not check for ABI violations; it emits your
instructions as written. Data corruption may result if your inline
assembly conflicts with the gcc-generated code around it, for example by
modifying a register that is not declared as an output or a clobber.

--
Regards,
Peter Teoh
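
The point about inline assembly and gcc can be sketched with a small userspace example. This is only an illustration, not kernel code: the helper name copy16 is made up for this sketch, and it assumes gcc-style extended asm on x86-64 (on anything else it falls back to memcpy). The clobber list is how you tell gcc which registers and memory the asm touches, since gcc will not work that out from the instruction text.

```c
#include <string.h>

/*
 * Hypothetical helper (name invented for illustration): copy 16 bytes
 * through an SSE2 register using gcc extended inline assembly.
 */
static inline void copy16(void *dst, const void *src)
{
#if defined(__GNUC__) && defined(__x86_64__)
        __asm__ __volatile__(
                "movdqu (%1), %%xmm0\n\t"   /* load 16 bytes from src  */
                "movdqu %%xmm0, (%0)"       /* store 16 bytes to dst   */
                : /* no register outputs */
                : "r" (dst), "r" (src)
                : "xmm0",    /* tell gcc that xmm0 is overwritten       */
                  "memory"); /* tell gcc that memory behind dst changes */
#else
        /* Fallback for other compilers/architectures: plain C copy. */
        memcpy(dst, src, 16);
#endif
}
```

If "xmm0" or "memory" were left out of the clobber list, gcc would be free to assume that xmm0 still holds whatever it put there earlier and that the destination buffer is unchanged, which is the kind of silent conflict described above. Inside the kernel the same asm would additionally have to sit between kernel_fpu_begin() and kernel_fpu_end().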