On Mon, Dec 10, 2018 at 7:15 PM H.J. Lu <hjl.tools@xxxxxxxxx> wrote:
>
> On Mon, Dec 10, 2018 at 5:23 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> > Hi all-
> >
> > I'm seriously considering sending a patch to remove x32 support from
> > upstream Linux. Here are some problems with it:
> >
> > 1. It's not entirely clear that it has users. As far as I know, it's
> > supported on Gentoo and Debian, and the Debian popcon graph for x32
> > has been falling off dramatically. I don't think that any enterprise
> > distro has ever supported x32.
>
> I have been posting x32 GCC results for years:
>
> https://gcc.gnu.org/ml/gcc-testresults/2018-12/msg01358.html

Right. My question wasn't whether x32 had developers -- it was
whether it had users. If the only users are a small handful of people
who keep the toolchain working and some people who benchmark it,
then I think the case for keeping it in upstream Linux is a bit weak.

>
> > 2. The way that system calls work is very strange. Most syscalls on
> > x32 enter through their *native* (i.e. not COMPAT_SYSCALL_DEFINE)
> > entry point, and this is intentional. For example, adjtimex() uses
> > the native entry, not the compat entry, because x32's struct timex
> > matches the x86_64 layout. But a handful of syscalls have separate
>
> This becomes less an issue with 64-bit time_t.
>
> > entry points -- these are the syscalls starting at 512. These enter
> > through the COMPAT_SYSCALL_DEFINE entry points.
> >
> > The x32 syscalls that are *not* in the 512 range violate all semblance
> > of kernel syscall convention. In the syscall handlers,
> > in_compat_syscall() returns true, but the COMPAT_SYSCALL_DEFINE entry
> > is not invoked. This is nutty and risks breaking things when people
> > refactor their syscall implementations. And no one tests these
> > things. Similarly, if someone calls any of the syscalls below 512 but
> > sets bit 31 in RAX, then the native entry will be called with
> > in_compat_syscall() set.
> >
> > Conversely, if you call a syscall in the 512 range with bit 31
> > *clear*, then the compat entry is called with in_compat_syscall()
> > *clear*. This is also nutty.
>
> This is to share syscalls between LP64 and ILP32 (x32) in x86-64 kernel.
>

I tried to understand what's going on. As far as I can tell, most of
the magic is the fact that __kernel_long_t and __kernel_ulong_t are
64-bit as seen by x32 user code. This means that a decent number of
uapi structures are the same on x32 and x86_64. Syscalls that only
use structures like this should route to the x86_64 entry points. But
the implementation is still highly dubious -- in_compat_syscall() will
be *true* in such system calls, which means that, if someone changes:

SYSCALL_DEFINE1(some_func, struct some_struct __user *, ptr)
{
  /* x32 goes here, but it's entirely non-obvious unless you read the
     x86 syscall table */
  native impl;
}

COMPAT_SYSCALL_DEFINE1(some_func, struct compat_some_struct __user *, ptr)
{
  compat impl;
}

to the Obviously Equivalent (tm):

SYSCALL_DEFINE1(some_func, struct some_struct __user *, ptr)
{
  struct some_struct kernel_val;
  if (in_compat_syscall()) {
    get_compat_some_struct(&kernel_val, ptr);
  } else {
    copy_from_user(&kernel_val, ptr, sizeof(struct some_struct));
  }
  do the work;
}

then x32 breaks.

And I don't even know how x32 is supposed to support some hypothetical
syscall like this:

long sys_nasty(struct timex *a, struct iovec *b);

where one argument has x32 and x86_64 matching but the other has x32
and x86_32 matching.

This whole thing seems extremely fragile.
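
To make the four combinations described above concrete, here is a
minimal userspace sketch in C. It is a model of the behavior as this
mail describes it, not kernel code. The names model_route(),
route_is_consistent(), struct route and X32_COMPAT_BASE are
hypothetical and exist only for illustration. X32_SYSCALL_BIT is
assumed to mirror the kernel's __X32_SYSCALL_BIT (0x40000000), which is
the bit the text refers to as "bit 31" in RAX.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's __X32_SYSCALL_BIT marker. */
#define X32_SYSCALL_BIT 0x40000000u
/* Per the mail: the x32-specific entries start at table index 512. */
#define X32_COMPAT_BASE 512u

enum entry_kind { ENTRY_NATIVE, ENTRY_COMPAT };

/* Hypothetical summary of what one syscall number leads to. */
struct route {
	enum entry_kind entry; /* handler flavor the table slot points at */
	bool in_compat;        /* what in_compat_syscall() would report */
};

/*
 * Model: the table slot is chosen from the number with the x32 bit
 * masked off, while the compat flag is derived from the x32 bit alone.
 * The two decisions are independent, which is the root of the problem.
 */
static struct route model_route(unsigned int nr)
{
	struct route r;
	unsigned int index = nr & ~X32_SYSCALL_BIT;

	r.in_compat = (nr & X32_SYSCALL_BIT) != 0;
	r.entry = index >= X32_COMPAT_BASE ? ENTRY_COMPAT : ENTRY_NATIVE;
	return r;
}

/* True only when the handler flavor agrees with the compat flag. */
static bool route_is_consistent(struct route r)
{
	return (r.entry == ENTRY_COMPAT) == r.in_compat;
}
```

Under this model, a number below 512 with the x32 bit set reaches the
native handler with the compat flag set, and a number in the 512 range
with the bit clear reaches the compat handler with the flag clear. Both
fail route_is_consistent(), and those are exactly the cases a refactor
that branches on in_compat_syscall() would get wrong.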