Re: Question about execve.

On Sun, Mar 28, 2010 at 1:32 PM, John David Anglin
<dave@xxxxxxxxxxxxxxxxxx> wrote:
>> I tried the vfork.c test case on my c3750 with 32-bit kernel.  It
>> didn't segv in a limited number of runs.  However, I did notice that
>> getpid() is broken after vfork().
>
> The vfork (clone) syscall corrupts (i.e., inserts wrong value)
> the parent tid.  In the following, I disabled the printf's and
> execve call in Carlos's testcase.  The child just does _exit().

The vfork syscall goes through sys_vfork in process.c, which passes
NULL for both parent_tidptr and child_tidptr, so the kernel is never
told about either pointer.

The kernel shouldn't be touching the parent tid pointer at all.

The vfork wrapper in glibc *does* negate the cached PID value, such
that the child doesn't see a valid PID value until after execve
completes.

> The fast path through getpid is:
>
> Dump of assembler code for function getpid:
> 0x0001ad2c <getpid+0>:  stw rp,-14(sp)
> 0x0001ad30 <getpid+4>:  mfctl tr3,r20
> 0x0001ad34 <getpid+8>:  ldw -414(r20),r19
> 0x0001ad38 <getpid+12>: cmpib,>= 0,r19,0x1ad48 <getpid+28>
> 0x0001ad3c <getpid+16>: copy r19,ret0
> 0x0001ad40 <getpid+20>: ldw -14(sp),rp
> 0x0001ad44 <getpid+24>: bv,n r0(rp)
>
> Breakpoint 3, 0x0001ad34 in getpid ()
> (gdb) del 2
> (gdb) p/x $r20
> $6 = 0x9a480
>
> Breakpoint 4, main () at vfork.c:17
> 17        child = vfork();
> (gdb) x/x 0x9a480 - 0x414
> 0x9a06c:        0x00000000
> (gdb) c
> Continuing.

This is the PID of the parent, not the TID. They are actually two
different fields.

nptl/descr.h
~~~
  /* Thread ID - which is also a 'is this thread descriptor (and
     therefore stack) used' flag.  */
  pid_t tid;

  /* Process ID - thread group ID in kernel speak.  */
  pid_t pid;
~~~

The PID of all the threads in a thread group is the same.
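
To make the PID/TID distinction concrete, here is a minimal sketch that asks the kernel for both values. It assumes Linux and uses the raw gettid syscall; the helper names current_tid and is_group_leader are made up for illustration and are not part of glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel-level thread ID of the calling thread (hypothetical helper). */
pid_t current_tid(void)
{
    pid_t tid = (pid_t) syscall(SYS_gettid);
    assert(tid > 0);   /* gettid does not fail for a live thread */
    return tid;
}

/* Nonzero when the caller is the initial thread of its process,
   i.e. the one thread whose TID equals the shared PID. */
int is_group_leader(void)
{
    return current_tid() == getpid();
}
```

Called from main() in a single-threaded program, is_group_leader() should return nonzero. Any additional thread would report the same getpid() but a different current_tid().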

Each thread has a unique TID.

During the vfork the parent does this:
~~~
 /* Load thread register. */
 mfctl %cr27, %r26 !
 /* Load cached parent PID. */
 ldw -1044(%r26),%r1 !
 /* Negate it, such that the child runs with
    a negative PID and no functions work until
    the execve. */
 sub %r0,%r1,%r1 !
 /* Store it back. */
 stw %r1,-1044(%r26) !
~~~
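
A rough C model of those four instructions may help. This is only a sketch: struct thread_descr and both function names are invented here, and the step where the parent negates the value back after the syscall is my assumption about the rest of the wrapper, not something shown in the snippet above.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical stand-in for the two fields quoted from nptl/descr.h. */
struct thread_descr {
    pid_t tid;   /* thread ID */
    pid_t pid;   /* cached process ID; 0 means nothing is cached */
};

/* The load / negate / store sequence from the assembly above. */
void negate_cached_pid(struct thread_descr *self)
{
    self->pid = -self->pid;
}

/* Model of the parent around vfork: negate, let the child run,
   then negate back (assumed).  Returns the cached value the child
   would see while it runs on the parent's descriptor. */
pid_t cached_pid_seen_by_child(struct thread_descr *self)
{
    pid_t before = self->pid;

    negate_cached_pid(self);
    pid_t during = self->pid;    /* the real vfork syscall would go here */
    negate_cached_pid(self);

    assert(self->pid == before); /* two negations restore the value */
    return during;
}
```

With a cached value of 1234 (an arbitrary example) the child would see -1234. With a cached value of 0 the negation leaves 0 in place.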

> Breakpoint 3, 0x0001ad34 in getpid ()
> (gdb) x/x 0x9a480 - 0x414
> 0x9a06c:        0x10101364

I don't see how this can be the negative of 0x0; negating 0x0 should
leave it at 0x0. I wonder what changed it.

> Breakpoint 4, main () at vfork.c:17
> 17        child = vfork();
> (gdb) x/64x 0x9a040
> 0x9a040:        0x00000000      0x00000000      0x00000000      0x00000000
> 0x9a050:        0x00000000      0x00000000      0x00000000      0x00000000
> 0x9a060:        0x00000000      0x00000000      0x00000000      0x00000000
> 0x9a070:        0x00000000      0x00000000      0x00000000      0x00000000
> 0x9a080:        0xc0521150      0x00000000      0x00000000      0x00000000
>
> Breakpoint 3, 0x0001ad34 in getpid ()
> (gdb) x/64x 0x9a040
> 0x9a040:        0x00000000      0x00000000      0x00000000      0x00000000
> 0x9a050:        0x00000000      0x00000000      0x00000000      0x00000000
> 0x9a060:        0x00000000      0x00000000      0x00000000      0x10101364
> 0x9a070:        0x00000000      0x00000000      0x00000000      0x00000000
> 0x9a080:        0xc03ae150      0x00000000      0x00000000      0x00000000
>
> So, the only location changed by vfork is the parent tid.

s/tid/pid/g.
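
The address arithmetic behind that correction can be checked mechanically. The sketch below uses only numbers that appear in this thread (thread pointer 0x9a480, displacement 0x414, dump base 0x9a040); the two helper names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Address of a descriptor field, given the thread pointer and the
   (positive) size of the negative displacement used to reach it. */
uint32_t field_address(uint32_t thread_ptr, uint32_t displacement)
{
    return thread_ptr - displacement;
}

/* Index of a 32-bit word inside an x/64x dump starting at dump_base. */
uint32_t word_index(uint32_t dump_base, uint32_t addr)
{
    assert(addr >= dump_base && (addr - dump_base) % 4 == 0);
    return (addr - dump_base) / 4;
}
```

field_address(0x9a480, 0x414) gives 0x9a06c, and 0x414 is 1044 decimal, the same displacement the vfork wrapper uses for the cached PID. word_index(0x9a040, 0x9a06c) is 11, the last word of the 0x9a060 row, which is the one word that differs between the two dumps. The strace output quoted further down reports child_tidptr=0x9a068, the word immediately before it, which is consistent with tid preceding pid in the descriptor.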

> dave@hiauly6:~$ strace ./vfork
> execve("./vfork", ["./vfork"], [/* 16 vars */]) = 0
> newuname({sys="Linux", node="hiauly6", ...}) = 0
> brk(0)                                  = 0x9a000
> brk(0x9acb4)                            = 0x9acb4
> brk(0xbbcb4)                            = 0xbbcb4
> brk(0xbc000)                            = 0xbc000
> clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x9a068) = 6212
> fstat64(0x1, 0xc0258a08)                = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40000000
> write(1, "parent is 269488996\n", 20parent is 269488996
> )   = 20
> exit_group(0)                           = ?
>
> (gdb) p/x 269488996
> $1 = 0x10101364
>
> I believe this is the "getpid" implementation:
>
> static inline __attribute__((always_inline)) pid_t
> really_getpid (pid_t oldval)
> {
>  if (__builtin_expect (oldval == 0, 1))
>    {
>      pid_t selftid = THREAD_GETMEM (THREAD_SELF, tid);
>      if (__builtin_expect (selftid != 0, 1))
>        return selftid;
>     }
>
>  INTERNAL_SYSCALL_DECL (err);
>  pid_t result = INTERNAL_SYSCALL (getpid, err, 0);
>
>  /* We do not set the PID field in the TID here since we might be
>     called from a signal handler while the thread executes fork.  */
>  if (oldval == 0)
>    THREAD_SETMEM (THREAD_SELF, tid, result);
>  return result;
> }
>
> As a side issue, gdb can't single step over mfctl instruction ;(

We'll fix gdb next :-)

Cheers,
Carlos.