On Thu, Oct 22, 2020 at 10:00:44AM -0700, Nick Desaulniers wrote:
> On Thu, Oct 22, 2020 at 9:40 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Oct 22, 2020 at 04:35:17PM +0000, David Laight wrote:
> > > Wait...
> > > readv(2) defines:
> > >     ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> >
> > It doesn't really matter what the manpage says.  What does the AOSP
> > libc header say?
>
> Same: https://android.googlesource.com/platform/bionic/+/refs/heads/master/libc/include/sys/uio.h#38
>
> Theoretically someone could bypass libc to make a system call, right?
>
> >
> > > But the syscall is defined as:
> > >
> > > SYSCALL_DEFINE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > >                 unsigned long, vlen)
> > > {
> > >         return do_readv(fd, vec, vlen, 0);
> > > }
> >
>

FWIW, glibc makes the readv() syscall assuming that fd and vlen are 'int' as
well.  So this problem isn't specific to Android's libc.

From objdump -d /lib/x86_64-linux-gnu/libc.so.6:

00000000000f4db0 <readv@@GLIBC_2.2.5>:
   f4db0:       64 8b 04 25 18 00 00    mov    %fs:0x18,%eax
   f4db7:       00
   f4db8:       85 c0                   test   %eax,%eax
   f4dba:       75 14                   jne    f4dd0 <readv@@GLIBC_2.2.5+0x20>
   f4dbc:       b8 13 00 00 00          mov    $0x13,%eax
   f4dc1:       0f 05                   syscall
   ...

There's some code for pthread cancellation, but no zeroing of the upper half of
the fd and vlen arguments, which are in %edi and %edx respectively.  But the
glibc function prototype uses 'int' for them, not 'unsigned long':

    ssize_t readv(int fd, const struct iovec *iov, int iovcnt);

So the high halves of the fd and iovcnt registers can contain garbage.  Or at
least that's what gcc (9.3.0) and clang (9.0.1) assume; they both compile the
following

    void g(unsigned int x);

    void f(unsigned long x)
    {
            g(x);
    }

into f() making a tail call to g(), without zeroing the top half of %rdi.

Also note the following program succeeds on Linux 5.9 on x86_64.  On kernels
that have this bug, it should fail.  (I couldn't get it to actually fail, so it
must depend on the compiler and/or the kernel config...)

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main()
    {
            int fd = open("/dev/zero", O_RDONLY);
            char buf[1000];
            struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
            long ret;

            ret = syscall(__NR_readv, fd, &iov, 0x100000001);
            if (ret < 0)
                    perror("readv failed");
            else
                    printf("read %ld bytes\n", ret);
    }

I think the right fix is to change the readv() (and writev(), etc.) syscalls to
take 'unsigned int' rather than 'unsigned long', as that is what the users are
assuming...

- Eric