Am 24.04.2013 11:42, schrieb Will Deacon: > Hi Andrew, > > On Tue, Apr 23, 2013 at 11:42:22PM +0100, André Hentschel wrote: >> Am 23.04.2013 11:15, schrieb Will Deacon: >>> You could introduce `get' tls functions, which don't do anything for CPUs >>> without the relevant registers. >> >> Before i have another round of testing and patch formatting/sending, >> what about the untested patch below? > > Ok. Comments inline. Thanks for them. My first kernel patch was adding an include, this second patch is a rather hard one. So every comment is appreciated. >> #ifdef __ASSEMBLY__ >> + .macro get_tls2_none, tp, tmp1 >> + .endm > > Cosmetic, but these are really horrible macro names. I guess that's only about removing '2'? >> + .macro get_tls2_v6, tp, tmp1 >> + ldr \tmp1, =elf_hwcap >> + ldr \tmp1, [\tmp1, #0] >> + tst \tmp1, #HWCAP_TLS @ hardware TLS available? >> + mrcne p15, 0, \tmp1, c13, c0, 2 @ get user r/w TLS register >> + strne \tmp1, [\tp, #4] > > You could factor out some of this hwcap checking now that it's used by both > set and get. Sure, but the code would still run twice (unlike my PATCHv2) >> + ldrdne \tmp1, \tmp2, [\tp] > > Does this work for big-endian CPUs? I'd say yes. >> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c >> index 047d3e4..6138eb1 100644 >> --- a/arch/arm/kernel/process.c >> +++ b/arch/arm/kernel/process.c >> @@ -395,7 +395,8 @@ copy_thread(unsigned long clone_flags, unsigned long stack_start, >> clear_ptrace_hw_breakpoint(p); >> >> if (clone_flags & CLONE_SETTLS) >> - thread->tp_value = childregs->ARM_r3; >> + thread->tp_value[0] = childregs->ARM_r3; >> + thread->tp_value[1] = current_thread_info()->tp_value[1]; >> > > This still isn't correct. Imagine the following sequence of events: > > - Task foo writes its TPIDRURW register from userspace and then issues a > fork() system call. No context switch occurs between these two events. > > - We start creating the child task, bar, and end up in copy_thread with > the `thread' pointing at foo's struct thread_info, which contains the > *old* TPIDRURW value. > > - We copy out the stale value into bar, which is then scheduled with an > old TPIDRURW value. > > The solution is to reload the value sitting in the register in copy_thread, > rather than relying on the thread_info being up-to-date. That's why I > previously suggested not using asm macros for the getters. Thanks for the informations. Further questions: Where would i place this functions? In tls.h as inline functions? How should that function look like? Containing compile-time checks and only using assembler for mrc instructions? Wouldn't that be much overhead in __switch_to? >> diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c >> index 03deeff..2bc1514 100644 >> --- a/arch/arm/kernel/ptrace.c >> +++ b/arch/arm/kernel/ptrace.c >> @@ -849,7 +849,7 @@ long arch_ptrace(struct task_struct *child, long request, >> #endif >> >> case PTRACE_GET_THREAD_AREA: >> - ret = put_user(task_thread_info(child)->tp_value, >> + ret = put_user(task_thread_info(child)->tp_value[0], >> datap); >> break; > > I'm guessing debuggers don't care about the new TLS value, or do we need a > new ptrace request? I'd say no. In case someone would jump in and say we need it, it'd be a seperate patch anyway. Best regards. -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html