On Mon, Jun 07, 2010 at 05:28:51PM +1200, Simon Kitching wrote:
> On Sun, 2010-06-06 at 13:49 +0530, Joel Fernandes wrote:
> > On Sun, Jun 6, 2010 at 3:06 AM, Vimal <j.vimal@xxxxxxxxx> wrote:
> > > Hi Joel,
> > >
> > >>
> > >> now i have a question, even if they share the same vm address space -
> > >> they definitely can't share the userspace stack, i'm sure the kernel
> > >> would have to reset it up in the same address space of the group of
> > >> threads but I really don't know how that works - could you share your
> > >> thoughts on this?
> > >
> > > I get what you're telling. I don't know how it's implemented, but it
> > > would be easier to think of it this way: A thread control block has
> > > pointers to the process control block (which contains the list of fds,
> > > sighands, the page tables, etc.), and the thread context (the set of
> > > general purpose registers). So two threads of the same process will
> > > have the same PCB but different TCB. So when a context switch occurs
> > > between threads of different PCBs, it's a context switch to a
> > > different process.
> >
> > I know we're digressing slightly from the original topic a bit a bit
> > I'm just curious to know how/where the new user-mode stack is setup
> > for a thread that shares the address space of the cloning thread.
> >
> > It appears that a new user mode stack is setup in the load_binary
> > function which is called during exec() . But I don't see where the new
> > stack for the thread is created during a clone() with CLONE_VM flag
> > set.
> >
> > During clone, the address spaces are shared , task_struct and
> > thread_info are copied, a new kernel mode stack is created, but I
> > don't see where a new userspace stack for the new thread is created.
> > neither is the stack pointer value changed in the new task_struct
> > (process control block), it is simply a copy of the process that
> > cloned/forked.
> >
> > I wonder if this is setup in userspace itself by a thread library?
>
> Have you tried "man clone"?
>
> This states that the clone() api is:
>   int clone(int (*fn)(void *), void *child_stack,
>             int flags, void *arg, ...
>
>
> Param child_stack is something that the *caller* of the clone() function
> must allocate. Presumably this is done via a malloc() or similar, ie the
> new thread's stack normally lives within the heap of the parent process.

Oh, I missed that. Very cool, got my answer :)

Also, I asked the same question on usenet
(comp.arch.systems.development.linux) and this is what David Shwartz
says:

"
Exactly. Before calling 'clone', user space sets up an extra stack. On
return, the newly-created thread sets its scheduling parameters and so
on, switches to the newly-created stack, and jumps to its start
function.

You can find all the code to do this in glibc's 'nptl' directory.
Start from sysdeps/pthread/createthread.c
"

> Hmm..wonder how stack-overflow detection is done for threads..
>
> File include/asm-generic/unistd.h defines syscalls, mapping clone to
> sys_clone [1][2]. The sys_clone implementation can be found in an
> arch-specific file, eg arch/x86/kernel/process.c. This then just
> forwards to do_fork() in kernel/fork.c, where the new stack is passed as
> an input param.
>
> do_fork() calls copy_process() which creates a new task_struct with
> dup_task_struct(current), then stores the stack start address into the
> task_struct. So when the scheduler switches to this newly-created
> thread's task_struct, that new stack address gets loaded into the
> stack-pointer register.

Actually, that stack is the per-process kernel stack.
dup_task_struct creates a new kernel stack, and this can be verified in
the code: dup_task_struct calls alloc_thread_info, which allocates the
page(s) for the kernel stack. Since the stack grows downward, the bottom
of the stack is set to the thread_info by type casting the allocated
page:

    (struct thread_info *)__get_free_pages(mask, THREAD_SIZE_ORDER);

> You talk about "kernel mode" and "user mode" stacks, but AFAIK there is
> no such distinction. I'm pretty sure that when a syscall occurs, the
> kernel code that executes runs on the calling thread's stack. There are
> things called "kernel threads" which do background tasks for the kernel,
> but they are never visible to user-space.

Are you sure? I'm fairly sure every process has a kernel mode stack, and
when you syscall into the kernel, the limited-size kernel mode stack is
switched to. You can't really use the userspace stack for kernel code,
because that stack is in a different address space and might not even be
mapped completely, and from what I know the kernel code can't fault
while accessing memory.

The kernel threads you are referring to execute in kernel mode and use
their corresponding kernel mode stack, AFAIK.

> I'm not sure what you mean by CLONE_VM above. If the VM is being cloned,
> then this is a traditional "fork process" call, and so the child can
> have exactly the same stack-pointer as the parent, because it is running
> with (effectively) a copy of the parent's memory. In this case, the
> stack_address param to clone can be zero which causes the parent's
> stack-pointer to be used. But you were asking about "a thread that
> shares the address space of the cloning thread", so CLONE_VM is not set.

Err, no: CLONE_VM is set when the calling process and the new thread
share the same address space, which is the case for threads.
The create_thread function, which creates a new thread in the NPTL
thread library, passes CLONE_VM to the clone system call:
http://bit.ly/ctgAc6

> By the way, why are you trying to read from a thread's stack?

I don't. Actually, that question was posted by someone else; I just
poked my nose in and raised a few more questions :)

> [1] The parameters to sys_clone don't quite match the clone() api. It
> seems that libc partially handles clone() in user-space; in the libc
> source I found file "sysdeps/unix/sysv/linux/i386/clone.S" which I can't
> really understand, as my x86 assembly code knowledge is near zero.
> However the file has these comments:
>   /* clone() is even more special than fork() as it mucks with stacks
>      and invokes a function in the right context after its all over. */
> and
>   /* Save the function pointer as the zeroth argument.
>      It will be popped off in the child in the ebx frobbing below. */
> which suggest that this is why the sys_clone() prototype doesn't exactly
> match the clone() prototype.

Very cool observation, I think.

> [2] unistd.h seems to be declaring the clone syscall# as 220:
>   #define __NR_clone 220
>   __SYSCALL(__NR_clone, sys_clone) /* .long sys_clone_wrapper */
> but libc's clone.S seems to be using 120:
>   #define __NR_clone 120
>   #define SYS_clone 120
> although oddly nothing appears to ever use these #defines. Can anyone
> explain this mismatch?

I'm not sure which unistd.h you were referring to, but for 32-bit x86,
unistd.h defines __NR_clone as 120.
I checked /usr/include/asm/unistd_32.h and
/usr/src/linux-`uname -r`/arch/x86/include/asm/unistd_32.h

About the syscall macro for clone: I checked clone.S, and the clone
syscall number is referenced indirectly, like so, before issuing the
syscall interrupt:

    movl $SYS_ify(clone),%eax

SYS_ify is defined in sysdeps/unix/sysv/linux/i386/sysdep.h as:

    #define SYS_ify(syscall_name) __NR_##syscall_name

so SYS_ify(clone) expands to __NR_clone.

take care,
Joel