Re: Access thread stack from another thread

Joel Fernandes <agnel.joel@xxxxxxxxx> · Mon, 7 Jun 2010 19:56:27 +0530

On Mon, Jun 07, 2010 at 05:28:51PM +1200, Simon Kitching wrote:
> On Sun, 2010-06-06 at 13:49 +0530, Joel Fernandes wrote:
> > On Sun, Jun 6, 2010 at 3:06 AM, Vimal <j.vimal@xxxxxxxxx> wrote:
> > > Hi Joel,
> > >
> > >>
> > >> now i have a question, even if they share the same vm address space -
> > >> they definitely can't share the userspace stack, i'm sure the kernel
> > >> would have to reset it up in the same address space of the group of
> > >> threads but I really don't know how that works - could you share your
> > >> thoughts on this?
> > >
> > > I get what you're telling.  I don't know how it's implemented, but it
> > > would be easier to think of it this way:  A thread control block has
> > > pointers to the process control block (which contains the list of fds,
> > > sighands, the page tables, etc.), and the thread context (the set of
> > > general purpose registers).  So two threads of the same process will
> > > have the same PCB but different TCB.  So when a context switch occurs
> > > between threads of different PCBs, it's a context switch to a
> > > different process.
> > 
> > I know we're digressing slightly from the original topic, but I'm
> > just curious to know how/where the new user-mode stack is set up
> > for a thread that shares the address space of the cloning thread.
> > 
> > It appears that a new user-mode stack is set up in the load_binary
> > function, which is called during exec(). But I don't see where the new
> > stack for the thread is created during a clone() with the CLONE_VM flag
> > set.
> > 
> > During clone, the address space is shared, task_struct and
> > thread_info are copied, and a new kernel-mode stack is created, but I
> > don't see where a new userspace stack for the new thread is created.
> > Nor is the stack pointer value changed in the new task_struct
> > (process control block); it is simply a copy of that of the process
> > that cloned/forked.
> > 
> > I wonder if this is setup in userspace itself by a thread library?
> 
> Have you tried "man clone"?
> 
> This states that the clone() api is:
>  int clone(int (*fn)(void *), void *child_stack,
>                  int flags, void *arg, ...
> 
> 
> Param child_stack is something that the *caller* of the clone() function
> must allocate. Presumably this is done via a malloc() or similar, ie the
> new thread's stack normally lives within the heap of the parent process.

Oh, I missed that. Very cool, got my answer :)
I also asked the same question on Usenet (comp.arch.systems.development.linux),
and this is what David Shwartz says:
"
Exactly. Before calling 'clone', user space sets up an extra stack. On 
return, the newly-created thread sets its scheduling parameters and so 
on, switches to the newly-created stack, and jumps to its start 
function. 
You can find all the code to do this in glibc's 'nptl' directory. 
Start from sysdeps/pthread/createthread.c  "
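
For anyone following along, here is a minimal sketch of that idea. It is
not glibc's code, just an illustration under my own assumptions: the
names child_fn, child_local_addr and child_ran_on_our_stack are made up,
the stack comes from plain malloc(), and I pass the highest address of
the buffer because the stack grows downward on x86. The child records
the address of one of its locals, and the parent checks that this
address falls inside the buffer it allocated.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef CLONE_VM
#define CLONE_VM 0x00000100 /* fallback if _GNU_SOURCE was defined too late */
#endif
/* Redundant when <sched.h> already declared it; harmless otherwise. */
int clone(int (*fn)(void *), void *stack, int flags, void *arg, ...);

/* Written by the child; the parent can read it because of CLONE_VM. */
static volatile uintptr_t child_local_addr;

static int child_fn(void *arg)
{
    int local = 0; /* lives on whatever stack the child runs on */
    (void)arg;
    child_local_addr = (uintptr_t)&local;
    return 0;
}

/* Returns 1 if the child's local was inside the caller-allocated stack,
 * 0 if it was not, and -1 on error. */
int child_ran_on_our_stack(size_t stack_size)
{
    assert(stack_size >= 16 * 1024 && stack_size % 16 == 0);

    char *stack = malloc(stack_size);
    if (stack == NULL)
        return -1;
    char *stack_top = stack + stack_size; /* stack grows downward */

    child_local_addr = 0;
    pid_t pid = clone(child_fn, stack_top, CLONE_VM | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) == -1) {
        free(stack);
        return -1;
    }

    uintptr_t lo = (uintptr_t)stack;
    uintptr_t hi = (uintptr_t)stack_top;
    int inside = child_local_addr >= lo && child_local_addr < hi;
    free(stack);
    return inside;
}
```

Calling child_ran_on_our_stack(64 * 1024) should return 1 on Linux,
which matches what the man page says: the kernel does not create the
user stack, the caller does.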

> Hmm..wonder how stack-overflow detection is done for threads..
> 
> File include/asm-generic/unistd.h defines syscalls, mapping clone to
> sys_clone [1][2]. The sys_clone implementation can be found in an
> arch-specific file, eg arch/x86/kernel/process.c. This then just
> forwards to do_fork() in kernel/fork.c, where the new stack is passed as
> an input param.
> 
> do_fork() calls copy_process() which creates a new task_struct with
> dup_task_struct(current), then stores the stack start address into the
> task_struct. So when the scheduler switches to this newly-created
> thread's task_struct, that new stack address gets loaded into the
> stack-pointer register.

Actually, that stack is the per-process kernel stack. dup_task_struct
creates a new kernel stack, which can be verified in the code:
dup_task_struct calls alloc_thread_info, which allocates the pages for
the kernel stack. Since the stack grows downward, the thread_info is
placed at the bottom (lowest address) of that allocation, simply by
casting the allocated pages:
(struct thread_info *)__get_free_pages(mask, THREAD_SIZE_ORDER);
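
To make the layout concrete, here is a small userspace simulation. It is
only a sketch: struct sim_thread_info, SIM_THREAD_SIZE and the sim_*
functions are hypothetical stand-ins, and the 8 KiB size is an
assumption (two 4 KiB pages, as I believe is typical on 32-bit x86). The
masking trick in sim_current_thread_info mirrors, as far as I know, how
the kernel finds thread_info from the stack pointer on that
architecture.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SIM_THREAD_SIZE 8192u /* assumed kernel stack size, power of two */

struct sim_thread_info { /* hypothetical stand-in for struct thread_info */
    int task_id;
};

/* Mimics alloc_thread_info(): one size-aligned block whose lowest bytes
 * hold the thread_info; the rest of the block is stack space. */
struct sim_thread_info *sim_alloc_thread_info(int task_id)
{
    void *block = aligned_alloc(SIM_THREAD_SIZE, SIM_THREAD_SIZE);
    if (block == NULL)
        return NULL;
    struct sim_thread_info *ti = (struct sim_thread_info *)block;
    ti->task_id = task_id;
    return ti;
}

/* The initial stack pointer: one past the highest byte of the block. */
uintptr_t sim_stack_top(const struct sim_thread_info *ti)
{
    assert(((uintptr_t)ti & (SIM_THREAD_SIZE - 1)) == 0); /* must be aligned */
    return (uintptr_t)ti + SIM_THREAD_SIZE;
}

/* Recovers the thread_info from any stack pointer inside the block by
 * clearing the low bits of the address. */
struct sim_thread_info *sim_current_thread_info(uintptr_t sp)
{
    return (struct sim_thread_info *)(sp & ~(uintptr_t)(SIM_THREAD_SIZE - 1));
}
```

Any stack pointer strictly below sim_stack_top() and inside the block
maps back to the same sim_thread_info, which is why the structure sits
at the low end while the stack grows down toward it.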

> You talk about "kernel mode" and "user mode" stacks, but AFAIK there is
> no such distinction. I'm pretty sure that when a syscall occurs, the
> kernel code that executes runs on the calling thread's stack. There are
> things called "kernel threads" which do background tasks for the kernel,
> but they are never visible to user-space. 

Are you sure? As far as I know, every process has a kernel-mode stack,
and when you make a syscall the kernel switches to that limited-size
stack. Kernel code can't simply run on the userspace stack: that stack
lives in user memory, might not even be completely mapped, and, from
what I know, kernel code can't take a page fault while accessing its
own stack.

The kernel threads you are referring to execute in kernel mode and use
their corresponding kernel mode stack AFAIK.

> I'm not sure what you mean by CLONE_VM above. If the VM is being cloned,
> then this is a traditional "fork process" call, and so the child can
> have exactly the same stack-pointer as the parent, because it is running
> with (effectively) a copy of the parent's memory. In this case, the
> stack_address param to clone can be zero which causes the parent's
> stack-pointer to be used. But you were asking about "a thread that
> shares the address space of the cloning thread", so CLONE_VM is not set.

Err, no: CLONE_VM is set when the calling process and the new thread
share the same address space, which is the case for threads.
The create_thread function, which creates a new thread in the NPTL thread
library, passes CLONE_VM to the clone system call: http://bit.ly/ctgAc6
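
A quick way to see the difference is the sketch below. It is my own
illustration, not NPTL code: shared_flag, set_flag and
parent_sees_child_write are made-up names, and NPTL passes many more
flags than just CLONE_VM. The child writes to a global; with CLONE_VM
the parent sees the write, without it the child only modifies its own
copy, as after fork().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef CLONE_VM
#define CLONE_VM 0x00000100 /* fallback if _GNU_SOURCE was defined too late */
#endif
/* Redundant when <sched.h> already declared it; harmless otherwise. */
int clone(int (*fn)(void *), void *stack, int flags, void *arg, ...);

#define DEMO_STACK_SIZE (64 * 1024)

static volatile int shared_flag;

static int set_flag(void *arg)
{
    (void)arg;
    shared_flag = 1; /* visible to the parent only if memory is shared */
    return 0;
}

/* Returns 1 if the parent observes the child's write, 0 if it does not,
 * and -1 on error. extra_flags must be either 0 or CLONE_VM. */
int parent_sees_child_write(int extra_flags)
{
    assert((extra_flags & ~CLONE_VM) == 0);

    char *stack = malloc(DEMO_STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_flag = 0;
    pid_t pid = clone(set_flag, stack + DEMO_STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited == -1)
        return -1;
    return shared_flag;
}
```

parent_sees_child_write(CLONE_VM) should return 1, and
parent_sees_child_write(0) should return 0.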

> By the way, why are you trying to read from a thread's stack?

I don't. Actually, that question was posted by someone else; I just poked my
nose in and raised a few more questions :)

> [1] The parameters to sys_clone don't quite match the clone() api. It
> seems that libc partially handles clone() in user-space; in the libc
> source I found file "sysdeps/unix/sysv/linux/i386/clone.S" which I can't
> really understand, as my x86 assembly code knowledge is near zero.
> However the file has these comments:
>   /* clone() is even more special than fork() as it mucks with stacks
>    and invokes a function in the right context after its all over.  */
> and
>   /* Save the function pointer as the zeroth argument.
>    It will be popped off in the child in the ebx frobbing below.  */
> which suggest that this is why the sys_clone() prototype doesn't exactly
> match the clone() prototype.

Very cool observation, I think.

> [2] unistd.h seems to be declaring the clone syscall# as 220:
>   #define __NR_clone 220
>   __SYSCALL(__NR_clone, sys_clone)	/* .long sys_clone_wrapper */
> but libc's clone.S seems to be using 120:
>   #define __NR_clone 120
>   #define SYS_clone 120
> although oddly nothing appears to ever use these #defines. Can anyone
> explain this mismatch?

I'm not sure which unistd.h you were referring to, but for 32-bit x86, unistd.h
defines __NR_clone as 120. I checked /usr/include/asm/unistd_32.h and
/usr/src/linux-`uname -r`/arch/x86/include/asm/unistd_32.h

About the syscall macro for clone: I checked clone.S, and __NR_clone is
referenced indirectly, like so, just before the syscall interrupt is issued:
movl $SYS_ify(clone),%eax
SYS_ify is defined in sysdeps/unix/sysv/linux/i386/sysdep.h as:
#define SYS_ify(syscall_name) __NR_##syscall_name
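
The token pasting can be tried out in userspace. The sketch below reuses
the SYS_ify definition quoted above; the two wrapper functions
(clone_syscall_number and getpid_via_sys_ify) are names I made up for
the example. The actual number depends on the architecture, so nothing
here hard-codes 120.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h> /* __NR_* and SYS_* constants */
#include <sys/types.h>
#include <unistd.h>      /* syscall(), getpid() */

/* Same definition as in sysdeps/unix/sysv/linux/i386/sysdep.h. */
#define SYS_ify(syscall_name) __NR_##syscall_name

/* SYS_ify(clone) expands to __NR_clone at preprocessing time. */
long clone_syscall_number(void)
{
    return SYS_ify(clone);
}

/* Issues a raw getpid syscall using the pasted constant. */
long getpid_via_sys_ify(void)
{
    long pid = syscall(SYS_ify(getpid));
    assert(pid > 0); /* getpid does not fail */
    return pid;
}
```

clone_syscall_number() returns whatever __NR_clone is in the headers of
the machine you compile on, and getpid_via_sys_ify() should agree with
getpid().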

take care,
Joel
