On 1 Nov 2020, at 21:01, Rich Felker <dalias@xxxxxxxx> wrote: > > On Sun, Nov 01, 2020 at 06:27:10PM +0000, Jessica Clarke wrote: >> On 1 Nov 2020, at 18:15, Jessica Clarke <jrtc27@xxxxxxxxxx> wrote: >>> >>> On 1 Nov 2020, at 18:07, Andy Lutomirski <luto@xxxxxxxxxx> wrote: >>>> >>>> On Sat, Oct 31, 2020 at 6:50 PM Rich Felker <dalias@xxxxxxxx> wrote: >>>>> >>>>> On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote: >>>>>> On 1 Nov 2020, at 01:22, Rich Felker <dalias@xxxxxxxx> wrote: >>>>>>> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote: >>>>>>>> cc: some libc folks >>>>>>>> >>>>>>>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@xxxxxxxxxx> wrote: >>>>>>>>> >>>>>>>>> POSIX specifies that the first field of the supplied msgp, namely mtype, >>>>>>>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to >>>>>>>>> the variable-length mtext field so we can't even bend the spec and make >>>>>>>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat >>>>>>>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and >>>>>>>>> msgrcv respectively. >>>>>>>> >>>>>>>> This is a mess. >>>>>>>> >>>>>>>> include/uapi/linux/msg.h has: >>>>>>>> >>>>>>>> /* message buffer for msgsnd and msgrcv calls */ >>>>>>>> struct msgbuf { >>>>>>>> __kernel_long_t mtype; /* type of message */ >>>>>>>> char mtext[1]; /* message text */ >>>>>>>> }; >>>>>>>> >>>>>>>> Your test has: >>>>>>>> >>>>>>>> struct msg_long { >>>>>>>> long mtype; >>>>>>>> char mtext[8]; >>>>>>>> }; >>>>>>>> >>>>>>>> struct msg_long_ext { >>>>>>>> struct msg_long msg_long; >>>>>>>> char mext[4]; >>>>>>>> }; >>>>>>>> >>>>>>>> and I'm unclear as to exactly what you're trying to do there with the >>>>>>>> "mext" part. >>>>>>>> >>>>>>>> POSIX says: >>>>>>>> >>>>>>>> The application shall ensure that the argument msgp points to a user- >>>>>>>> defined buffer that contains first a field of type long specifying the >>>>>>>> type of the message, and then a data portion that holds the data bytes >>>>>>>> of the message. The structure below is an example of what this user-de‐ >>>>>>>> fined buffer might look like: >>>>>>>> >>>>>>>> struct mymsg { >>>>>>>> long mtype; /* Message type. */ >>>>>>>> char mtext[1]; /* Message text. */ >>>>>>>> } >>>>>>>> >>>>>>>> NTP has this delightful piece of code: >>>>>>>> >>>>>>>> 44 typedef union { >>>>>>>> 45 struct msgbuf msgp; >>>>>>>> 46 struct { >>>>>>>> 47 long mtype; >>>>>>>> 48 int code; >>>>>>>> 49 struct timeval tv; >>>>>>>> 50 } msgb; >>>>>>>> 51 } MsgBuf; >>>>>>>> >>>>>>>> bluefish has: >>>>>>>> >>>>>>>> struct small_msgbuf { >>>>>>>> long mtype; >>>>>>>> char mtext[MSQ_QUEUE_SMALL_SIZE]; >>>>>>>> } small_msgp; >>>>>>>> >>>>>>>> >>>>>>>> My laptop has nothing at all in /dev/mqueue. >>>>>>>> >>>>>>>> So I don't really know what the right thing to do is. Certainly if >>>>>>>> we're going to apply this patch, we should also fix the header. I >>>>>>>> almost think we should *delete* struct msgbuf from the headers, since >>>>>>>> it's all kinds of busted, but that will break the NTP build. Ideally >>>>>>>> we would go back in time and remove it from the headers. >>>>>>>> >>>>>>>> Libc people, any insight? We can probably fix the bug without >>>>>>>> annoying anyone given how lightly x32 is used and how lightly POSIX >>>>>>>> message queues are used. >>>>>>> >>>>>>> If it's that outright wrong and always has been, I feel like the old >>>>>>> syscall numbers should just be deprecated and new ones assigned. >>>>>>> Otherwise, there's no way for userspace to be safe against data >>>>>>> corruption when run on older kernels. If there's a new syscall number, >>>>>>> libc can just use the new one unconditionally (giving ENOSYS on >>>>>>> kernels where it would be broken) or have a x32-specific >>>>>>> implementation that makes the old syscall and performs translation if >>>>>>> the new one fails with ENOSYS. >>>>>> >>>>>> That doesn't really help broken code continue to work reliably, as >>>>>> upgrading libc will just pull in the new syscall for a binary that's >>>>>> expecting the broken behaviour, unless you do symbol versioning, but >>>>>> then it'll just break when you next recompile the code, and there's no >>>>>> way for that to be diagnosed given the *application* has to define the >>>>>> type. But given it's application-defined I really struggle to see how >>>>>> any code out there is actually expecting the current x32 behaviour as >>>>>> you'd have to go really out of your way to find out that x32 is broken >>>>>> and needs __kernel_long_t. I don't think there's any way around just >>>>>> technically breaking ABI whilst likely really fixing ABI in 99.999% of >>>>>> cases (maybe 100%). >>>>> >>>>> I'm not opposed to "breaking ABI" here because the current syscall >>>>> doesn't work unless someone wrote bogus x32-specific code to work >>>>> around it being wrong. I don't particularly want to preserve any of >>>>> the current behavior. >>>>> >>>>> What I am somewhat opposed to is making a situation where an updated >>>>> libc can't be safe against getting run on a kernel with a broken >>>>> version of the syscall and silently corrupting data. I'm flexible >>>>> about how avoiding tha tis achieved. >>>> >>>> If we're sufficiently confident that we won't regress anything by >>>> fixing the bug, I propose we do the following. First, we commit a fix >>>> that's Jessica's patch plus a fix to struct msghdr, and we mark that >>>> for -stable. Then we commit another patch that removes 'struct >>>> msghdr' from uapi entirely, but we don't mark that for -stable. If >>>> people complain about the latter, we revert it. >>> >>> Thinking about this more, MIPS n32 is also affected by that header. In >>> fact the n32 syscalls currently do the right thing and use the compat >>> implementations, so the header is currently out-of-sync with the kernel >>> there*. This should be noted when committing the change to msg.h. >> >> Never mind, it seems MIPS n32 is weird and leaves __kernel_long_t as a >> normal long despite being an ILP32-on-64-bit ABI, I guess because it's >> inherited from IRIX rather than being invented by the GNU world. > > Yes, the whole __kernel_long_t invention is largely x32-only (maybe > theoretically on aarch64-ilp32 too? if that even really exists?) and > is pretty much entirely a mistake from lacking the proper > infrastructure to do time64 when x32 was introduced (note that n32 has > 32-bit old-time_t). I hope effort will be made to keep the same > mistake from creeping into future ilp32-on-64 ABIs if there are any. Ping? Does anyone have further thoughts/are people happy for this to land? Jess