Re: Can we drop upstream Linux x32 support?

Andy Lutomirski <luto@xxxxxxxxxx> · Mon, 10 Dec 2018 21:35:02 -0800

On Mon, Dec 10, 2018 at 7:15 PM H.J. Lu <hjl.tools@xxxxxxxxx> wrote:
>
> On Mon, Dec 10, 2018 at 5:23 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> > Hi all-
> >
> > I'm seriously considering sending a patch to remove x32 support from
> > upstream Linux.  Here are some problems with it:
> >
> > 1. It's not entirely clear that it has users.  As far as I know, it's
> > supported on Gentoo and Debian, and the Debian popcon graph for x32
> > has been falling off dramatically.  I don't think that any enterprise
> > distro has ever supported x32.
>
> I have been posting x32 GCC results for years:
>
> https://gcc.gnu.org/ml/gcc-testresults/2018-12/msg01358.html

Right.  My question wasn't whether x32 had developers -- it was
whether it had users.  If the only users are a small handful of people
who keep the toolchain working and some people who benchmark it,
then I think the case for keeping it in upstream Linux is a bit weak.

>
> > 2. The way that system calls work is very strange.  Most syscalls on
> > x32 enter through their *native* (i.e. not COMPAT_SYSCALL_DEFINE)
> > entry point, and this is intentional.  For example, adjtimex() uses
> > the native entry, not the compat entry, because x32's struct timex
> > matches the x86_64 layout.  But a handful of syscalls have separate
>
> This becomes less an issue with 64-bit time_t.
>
> > entry points -- these are the syscalls starting at 512.  These enter
> > through the COMPAT_SYSCALL_DEFINE entry points.
> >
> > The x32 syscalls that are *not* in the 512 range violate all semblance
> > of kernel syscall convention.  In the syscall handlers,
> > in_compat_syscall() returns true, but the COMPAT_SYSCALL_DEFINE entry
> > is not invoked.   This is nutty and risks breaking things when people
> > refactor their syscall implementations.  And no one tests these
> > things.  Similarly, if someone calls any of the syscalls below 512 but
> > sets bit 31 in RAX, then the native entry will be called with
> > in_compat_syscall() set.
> >
> > Conversely, if you call a syscall in the 512 range with bit 31
> > *clear*, then the compat entry is called with in_compat_syscall()
> > *clear*.  This is also nutty.
>
> This is to share syscalls between LP64 and ILP32 (x32) in x86-64 kernel.
>

I tried to understand what's going on.  As far as I can tell, most of
the magic is the fact that __kernel_long_t and __kernel_ulong_t are
64-bit as seen by x32 user code.  This means that a decent number of
uapi structures are the same on x32 and x86_64.  Syscalls that only
use structures like this should route to the x86_64 entry points.  But
the implementation is still highly dubious -- in_compat_syscall() will
be *true* in such system calls, which means that, if someone changes:

SYSCALL_DEFINE1(some_func, struct some_struct __user *, ptr)
{
  /*
   * x32 goes here, but it's entirely non-obvious unless you read the
   * x86 syscall table.
   */
  native impl;
}

COMPAT_SYSCALL_DEFINE1(some_func, struct compat_some_struct __user *, ptr)
{
  compat impl;
}

to the Obviously Equivalent (tm):

SYSCALL_DEFINE1(some_func, struct some_struct __user *, ptr)
{
  struct some_struct kernel_val;
  if (in_compat_syscall()) {
    get_compat_some_struct(&kernel_val, ptr);
  } else {
    copy_from_user(&kernel_val, ptr, sizeof(struct some_struct));
  }
  do the work;
}

then x32 breaks.
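
To make the failure concrete, here is a minimal user-space sketch of
the refactored handler.  It is not kernel code: the two-field layouts,
the abi enum and the memcpy() standing in for the user copy are all
assumptions made up for illustration.  The only behavior taken from
the description above is that in_compat_syscall() is true for x32 even
though x32 passes the x86_64 layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the three ABIs an x86_64 kernel can serve. */
enum abi { ABI_X86_64, ABI_X32, ABI_I386 };

/* Hypothetical layouts: x32 user code sees the 64-bit one. */
struct some_struct        { int64_t a; int64_t b; };
struct compat_some_struct { int32_t a; int32_t b; };

/* Modeled on the description above: true for x32 as well as i386. */
static bool in_compat_syscall(enum abi abi)
{
    return abi != ABI_X86_64;
}

/* The "obviously equivalent" handler; returns field b as it was parsed. */
static int64_t refactored_some_func(enum abi abi, const void *user_ptr)
{
    struct some_struct kernel_val;

    assert(user_ptr != NULL);
    if (in_compat_syscall(abi)) {
        /* Reads the 32-bit layout, which is wrong for an x32 caller. */
        struct compat_some_struct c;
        memcpy(&c, user_ptr, sizeof(c));
        kernel_val.a = c.a;
        kernel_val.b = c.b;
    } else {
        memcpy(&kernel_val, user_ptr, sizeof(kernel_val));
    }
    return kernel_val.b;
}
```

With a 64-bit-layout buffer holding a = 1, b = 2, the ABI_X86_64 call
returns 2, and so does ABI_I386 with the 32-bit layout.  The ABI_X32
call takes the compat branch on a 64-bit-layout buffer, reads the two
halves of a, and never sees b.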

And I don't even know how x32 is supposed to support some hypothetical
syscall like this:

long sys_nasty(struct timex *a, struct iovec *b);

where one argument has x32 and x86_64 matching but the other has x32
and x86_32 matching.
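
A rough way to see the problem is to model which layout each argument
uses per ABI and ask whether either existing entry point fits.  This
is only a sketch: the struct names with a _like suffix, the abi enum
and the helper functions are hypothetical, and fixed-width fields
stand in for long, pointers and size_t.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum abi { ABI_X86_64, ABI_X32, ABI_I386 };

/* First argument: long-based fields, so x32 follows x86_64. */
struct native_timex_like { int64_t modes; int64_t offset; };
struct compat_timex_like { int32_t modes; int32_t offset; };

/* Second argument: pointer and size_t fields, so x32 follows x86_32. */
struct native_iovec_like { uint64_t iov_base; uint64_t iov_len; };
struct compat_iovec_like { uint32_t iov_base; uint32_t iov_len; };

/* Size of argument a as laid out by user code for the given ABI. */
static size_t arg_a_size(enum abi abi)
{
    return abi == ABI_I386 ? sizeof(struct compat_timex_like)
                           : sizeof(struct native_timex_like);
}

/* Size of argument b as laid out by user code for the given ABI. */
static size_t arg_b_size(enum abi abi)
{
    return abi == ABI_X86_64 ? sizeof(struct native_iovec_like)
                             : sizeof(struct compat_iovec_like);
}

/* Would the native entry parse both arguments correctly? */
static bool fits_native_entry(enum abi abi)
{
    return arg_a_size(abi) == sizeof(struct native_timex_like) &&
           arg_b_size(abi) == sizeof(struct native_iovec_like);
}

/* Would the compat entry parse both arguments correctly? */
static bool fits_compat_entry(enum abi abi)
{
    return arg_a_size(abi) == sizeof(struct compat_timex_like) &&
           arg_b_size(abi) == sizeof(struct compat_iovec_like);
}
```

In this model x86_64 fits the native entry and x86_32 fits the compat
entry, but x32 fits neither, so it would need a third, x32-specific
entry point or per-argument special-casing.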

This whole thing seems extremely fragile.