getcpu() returns EFAULT when called via the vdso

sched_getcpu() on my ia64 systems is failing with EFAULT.  simple test code:

$ cat test.c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
int main(void) {
	int i, e;
	puts("");
	close(444);
	/* a syscall that glibc issues through the vdso */
	kill(0, 0);
	close(333);
	/* getcpu through the generic syscall() helper */
	syscall(__NR_getcpu, &i, 0, 0);
	close(123);
	/* getcpu through the glibc wrapper; save errno before anything clobbers it */
	errno = 0;
	i = sched_getcpu();
	e = errno;
	close(321);
	printf("getcpu() = %i: %s\n", i, strerror(e));
	return 0;
}

the puts() is to force initialization of some internal libc stuff so the strace 
output is easier to read later on.  the close() calls make it easy to pick out 
the different steps.

when i run this with newer glibc versions (like 2.13+), i get:
	getcpu() = -1: Bad address

running it through strace, we see:
close(444)                              = -1 EBADF (Bad file descriptor)
kill(0, SIG_0)                          = 0
close(333)                              = -1 EBADF (Bad file descriptor)
getcpu([1], NULL, 0)                    = 0
close(123)                              = -1 EBADF (Bad file descriptor)
close(321)                              = -1 EBADF (Bad file descriptor)
write(1, "\ngetcpu() = -1: Bad address\n", 28) = 28

you can see the syscall() working, but the sched_getcpu() call never shows up 
between close(123) and close(321), so it doesn't seem to make it into 
supervisor mode.  glibc internally uses the vdso for doing most syscalls 
(while the syscall() func sticks to the old break method).  kill() also goes 
through the vdso and strace decodes it fine, so strace can handle both styles.  
that leaves something funky with getcpu specifically.

the sched_getcpu() code is somewhat simple:
int sched_getcpu (void) {
	unsigned int cpu;
	int r = INLINE_SYSCALL (getcpu, 3, &cpu, NULL, NULL);
	return r == -1 ? r : cpu;
}
so it passes in a pointer to an int on the stack, and 2 null pointers ...
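
for comparison, here is a minimal sketch of the same wrapper logic built on syscall() instead of INLINE_SYSCALL, so it goes through the path that strace showed working above.  getcpu_via_syscall() and compare_getcpu() are names made up for this example, not anything glibc provides.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* hypothetical helper: same logic as sched_getcpu(), routed through syscall() */
int getcpu_via_syscall(void)
{
	unsigned int cpu;
	long r = syscall(__NR_getcpu, &cpu, NULL, NULL);

	return r == -1 ? -1 : (int)cpu;
}

/* hypothetical helper: print what both paths report, no strace needed */
void compare_getcpu(void)
{
	int raw, libc, e;

	raw = getcpu_via_syscall();
	errno = 0;
	libc = sched_getcpu();
	e = errno;	/* save errno before printf() can touch it */
	printf("syscall() = %i, sched_getcpu() = %i: %s\n", raw, libc, strerror(e));
}
```

calling compare_getcpu() from main() prints both results on one line.  the two cpu numbers can legitimately differ if the task migrates between the calls, so only the error case is interesting.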

the disassembly of the sched_getcpu() code at runtime looks like:
   0x2000000000203940 <+0>:     [MMI]       alloc r32=ar.pfs,9,1,0
   0x2000000000203941 <+1>:                 adds r14=8,r13
   0x2000000000203942 <+2>:                 mov r33=r12;;
   0x2000000000203950 <+16>:    [MMI]       ld8 r14=[r14]
   0x2000000000203951 <+17>:                nop.m 0x0
   0x2000000000203952 <+18>:                mov r15=1304
   0x2000000000203960 <+32>:    [MII]       mov r35=r0
   0x2000000000203961 <+33>:                mov r34=r0;;
   0x2000000000203962 <+34>:                mov b7=r14;;
   0x2000000000203970 <+48>:    [MIB]       nop.m 0x0
   0x2000000000203971 <+49>:                nop.i 0x0
   0x2000000000203972 <+50>:                br.call.sptk.many b6=b7;;

so we see:
 - b7 gets loaded with a pointer to the __kernel_syscall_via_epc entry point
 - r15 gets the right syscall number (1304 for __NR_getcpu)
 - r33 is a pointer to the stack (gdb shows it is in the $sp region)
 - r34 and r35 get zeroed out

yet once i step over the call to __kernel_syscall_via_epc, i see r8 is set to 
14 (EFAULT).  i can't see where that value gets set up in kernel/gate.S, but 
my knowledge of ia64 assembly isn't that great, nor of the kernel paths, so 
i'm hoping someone can point out the obvious to me here.
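
as a sanity check that the 14 in r8 really is the "Bad address" the test program printed, here is a small sketch that formats a raw error number next to its strerror() text.  format_errno() is a hypothetical helper written just for this.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* hypothetical helper: render a raw error number, flagging EFAULT by name */
int format_errno(int code, char *buf, size_t len)
{
	return snprintf(buf, len, "%i%s: %s", code,
			code == EFAULT ? " (EFAULT)" : "", strerror(code));
}
```

with glibc in the default locale, format_errno(14, buf, sizeof(buf)) should leave "14 (EFAULT): Bad address" in buf, matching the message from the test run.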

i've tested linux 3.0.6 and 3.1.6, glibc 2.13 and 2.15/2.16, and gcc 4.5.3 
(just what i have access to).  they all behave the same.
-mike
