> On Dec 11, 2018, at 3:35 PM, Thorsten Glaser <tg@xxxxxxxxx> wrote:
>
> Andy Lutomirski dixit:
>
>> What happens if someone adds a struct like:
>>
>> struct nasty_on_x32 {
>>   __kernel_long_t a;
>>   void * __user b;
>> };
>>
>> On x86_64, that's two 8-byte fields. On x86_32, it's two four-byte
>> fields. On x32, it's an 8-byte field and a 4-byte field. Now what?
>
> Yes, that’s indeed ugly. I understand. But don’t we already have
> this problem with architectures which support multiple ABIs at the
> same time? An amd64 kernel with i386 userspace comes to mind, or
> the multiple MIPS ABIs.

That’s the thing, though: the whole generic kernel compat
infrastructure assumes there are at most two ABIs: native and, if
enabled and relevant, compat. x32 breaks this entirely.

>
>> I'm sure we could have some magic gcc plugin or other nifty tool that
>> gives us:
>>
>> copy_from_user(struct struct_name, kernel_ptr, user_ptr);
>
> Something like that might be useful. Generate call stubs, which
> then call the syscall implementation with the actual user-space
> struct contents as arguments. Hm, that might be too generic to
> be useful. Generate macros that can read from or write specific
> structures to userspace?
>
> I think something like this could solve other more general problems
> as well, so it might be “nice to have anyway”. Of course it’s work,
> and I’m not involved enough in Linux kernel programming to be able
> to usefully help with it (doing too much elsewhere already).
>
>> actually do this work. Instead we get ad hoc fixes for each syscall,
>> along the lines of preadv64v2(), which get done when somebody notices
>
> Yes, that’s absolutely ugly and ridiculous and all kinds of bad.
>
> On the other hand, from my current experience, someone (Arnd?) noticed
> all the currently existing baddies for x32 already and fixed them.
>
> New syscalls are indeed an issue, but perhaps something generating
> copyinout stubs could help. This might allow other architectures
> that could do with a new ABI but have until now feared the overhead
> as well. (IIRC, m68k could do with a new ABI that reserves a register
> for TLS, but Geert would know. At the same time, time_t and off_t could
> be bumped to 64 bit. Something like that. If changing sizes of types
> shared between kernel and user spaces is not something feared…)

Magic autogenerated stubs would be great. Difficult, too, given
unions, multiplexers, cmsg, etc.

I suppose I will see how bad it would be to split out the x32 syscall
table and at least isolate the mess to some extent.

IMO the real right solution would be to push the whole problem to
userspace: get an ILP32 system working with almost or entirely LP64
syscalls. POSIX support might have to be a bit flexible, but still.
How hard would it be to have __attribute__((ilp64)), with an optional
warning if any embedded structs are not ilp64? This plus a wrapper to
make sure that mmap puts everything below 4GB ought to do the trick.

Or something like what arm64 is proposing where the kernel ABI has
32-bit long doesn’t seem too horrible.
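
To make the nasty_on_x32 example concrete, here is a rough userspace sketch of the three layouts using fixed-width stand-ins. Everything in it is an assumption for illustration: the struct names, the helper names, and the choice to model user pointers as plain integers are all hypothetical, and none of this is existing kernel code. It only shows that the x32 layout matches native for the first field and compat for the second, so neither existing conversion path fits as-is.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/*
 * Hypothetical stand-ins for struct nasty_on_x32 as each ABI would lay
 * it out. User pointers are modelled as integers of the ABI's pointer
 * width. _Alignas(8) pins 64-bit fields to 8-byte alignment so the
 * layouts do not depend on the machine this is compiled on.
 */
struct nasty_x86_64 {          /* long: 8 bytes, pointer: 8 bytes */
    _Alignas(8) int64_t  a;
    _Alignas(8) uint64_t b;
};

struct nasty_i386 {            /* long: 4 bytes, pointer: 4 bytes */
    int32_t  a;
    uint32_t b;
};

struct nasty_x32 {             /* __kernel_long_t: 8 bytes, pointer: 4 */
    _Alignas(8) int64_t a;
    uint32_t b;
};

/* Field sizes as described in the thread. */
static_assert(sizeof(((struct nasty_x86_64 *)0)->b) == 8, "native ptr");
static_assert(sizeof(((struct nasty_i386 *)0)->a) == 4, "compat long");
static_assert(sizeof(((struct nasty_x32 *)0)->a) == 8, "x32 long");
static_assert(sizeof(((struct nasty_x32 *)0)->b) == 4, "x32 ptr");

/* What a hand-written (or generated) x32 stub would have to do. */
static inline struct nasty_x86_64 nasty_x32_to_native(struct nasty_x32 in)
{
    struct nasty_x86_64 out;
    out.a = in.a;              /* same width, copied as-is */
    out.b = in.b;              /* zero-extend the 32-bit user pointer */
    return out;
}

/* The classic compat conversion, for comparison. */
static inline struct nasty_x86_64 nasty_i386_to_native(struct nasty_i386 in)
{
    struct nasty_x86_64 out;
    out.a = in.a;              /* sign-extend the 32-bit long */
    out.b = in.b;              /* zero-extend the 32-bit user pointer */
    return out;
}
```

The two helpers differ only in how they treat the first field, which is the point: a two-ABI compat scheme has exactly one place to put that difference, and x32 needs a second one for every struct like this.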