On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote: > On 1 Nov 2020, at 01:22, Rich Felker <dalias@xxxxxxxx> wrote: > > On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote: > >> cc: some libc folks > >> > >> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@xxxxxxxxxx> wrote: > >>> > >>> POSIX specifies that the first field of the supplied msgp, namely mtype, > >>> is a long, not a __kernel_long_t, and it's a user-defined struct due to > >>> the variable-length mtext field so we can't even bend the spec and make > >>> it a __kernel_long_t even if we wanted to. Thus we must use the compat > >>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and > >>> msgrcv respectively. > >> > >> This is a mess. > >> > >> include/uapi/linux/msg.h has: > >> > >> /* message buffer for msgsnd and msgrcv calls */ > >> struct msgbuf { > >> __kernel_long_t mtype; /* type of message */ > >> char mtext[1]; /* message text */ > >> }; > >> > >> Your test has: > >> > >> struct msg_long { > >> long mtype; > >> char mtext[8]; > >> }; > >> > >> struct msg_long_ext { > >> struct msg_long msg_long; > >> char mext[4]; > >> }; > >> > >> and I'm unclear as to exactly what you're trying to do there with the > >> "mext" part. > >> > >> POSIX says: > >> > >> The application shall ensure that the argument msgp points to a user- > >> defined buffer that contains first a field of type long specifying the > >> type of the message, and then a data portion that holds the data bytes > >> of the message. The structure below is an example of what this user-de‐ > >> fined buffer might look like: > >> > >> struct mymsg { > >> long mtype; /* Message type. */ > >> char mtext[1]; /* Message text. */ > >> } > >> > >> NTP has this delightful piece of code: > >> > >> 44 typedef union { > >> 45 struct msgbuf msgp; > >> 46 struct { > >> 47 long mtype; > >> 48 int code; > >> 49 struct timeval tv; > >> 50 } msgb; > >> 51 } MsgBuf; > >> > >> bluefish has: > >> > >> struct small_msgbuf { > >> long mtype; > >> char mtext[MSQ_QUEUE_SMALL_SIZE]; > >> } small_msgp; > >> > >> > >> My laptop has nothing at all in /dev/mqueue. > >> > >> So I don't really know what the right thing to do is. Certainly if > >> we're going to apply this patch, we should also fix the header. I > >> almost think we should *delete* struct msgbuf from the headers, since > >> it's all kinds of busted, but that will break the NTP build. Ideally > >> we would go back in time and remove it from the headers. > >> > >> Libc people, any insight? We can probably fix the bug without > >> annoying anyone given how lightly x32 is used and how lightly POSIX > >> message queues are used. > > > > If it's that outright wrong and always has been, I feel like the old > > syscall numbers should just be deprecated and new ones assigned. > > Otherwise, there's no way for userspace to be safe against data > > corruption when run on older kernels. If there's a new syscall number, > > libc can just use the new one unconditionally (giving ENOSYS on > > kernels where it would be broken) or have a x32-specific > > implementation that makes the old syscall and performs translation if > > the new one fails with ENOSYS. > > That doesn't really help broken code continue to work reliably, as > upgrading libc will just pull in the new syscall for a binary that's > expecting the broken behaviour, unless you do symbol versioning, but > then it'll just break when you next recompile the code, and there's no > way for that to be diagnosed given the *application* has to define the > type. But given it's application-defined I really struggle to see how > any code out there is actually expecting the current x32 behaviour as > you'd have to go really out of your way to find out that x32 is broken > and needs __kernel_long_t. I don't think there's any way around just > technically breaking ABI whilst likely really fixing ABI in 99.999% of > cases (maybe 100%). I'm not opposed to "breaking ABI" here because the current syscall doesn't work unless someone wrote bogus x32-specific code to work around it being wrong. I don't particularly want to preserve any of the current behavior. What I am somewhat opposed to is making a situation where an updated libc can't be safe against getting run on a kernel with a broken version of the syscall and silently corrupting data. I'm flexible about how avoiding tha tis achieved. Rich