Marco Elver <elver@xxxxxxxxxx> writes:

> Hello, Eric,
>
> By inspecting the logs I've seen that about 3 years ago there had been a
> number of siginfo_t cleanups. This included moving si_addr_lsb:
>
>   b68a68d3dcc1 ("signal: Move addr_lsb into the _sigfault union for clarity")
>   859d880cf544 ("signal: Correct the offset of si_pkey in struct siginfo")
>   8420f71943ae ("signal: Correct the offset of si_pkey and si_lower in struct siginfo on m68k")
>
> In an ideal world, we could just have si_addr + the union in _sigfault,
> but it seems there are more corner cases. :-/
>
> The reason I've stumbled upon this is that I wanted to add the just
> merged si_perf [1] field to glibc. But what I noticed is that glibc's
> definition and ours are vastly different around si_addr_lsb, si_lower,
> si_upper, and si_pkey.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=42dec9a936e7696bea1f27d3c5a0068cd9aa95fd
>
> In our current definition of siginfo_t, si_addr_lsb is placed into the
> same union as si_lower, si_upper, and si_pkey (and now si_perf). From
> the logs I see that si_lower, si_upper, and si_pkey are padded because
> si_addr_lsb used to be outside the union, which goes back to
> "signal: Move addr_lsb into the _sigfault union for clarity".
>
> Since then, si_addr_lsb must also be pointer-aligned, because the union
> containing it must be pointer-aligned (because si_upper, si_lower). On
> all architectures where si_addr_lsb is right after si_addr, this is
> perfectly fine, because si_addr itself is a pointer...
>
> ... except for the anomaly that are 64-bit architectures that define
> __ARCH_SI_TRAPNO and want that 'int si_trapno'. Like, for example
> sparc64, which means siginfo_t's ABI has been subtly broken on sparc64
> since v4.16.
>
> The following static asserts illustrate this:
>
> --- a/arch/sparc/kernel/signal_64.c
> +++ b/arch/sparc/kernel/signal_64.c
> @@ -556,3 +556,37 @@ void do_notify_resume(struct pt_regs *regs, unsigned long orig_i0, unsigned long
>  	user_enter();
>  }
>
> +static_assert(offsetof(siginfo_t, si_signo) == 0);
> +static_assert(offsetof(siginfo_t, si_errno) == 4);
> +static_assert(offsetof(siginfo_t, si_code) == 8);
> +static_assert(offsetof(siginfo_t, si_pid) == 16);
> +static_assert(offsetof(siginfo_t, si_uid) == 20);
> +static_assert(offsetof(siginfo_t, si_tid) == 16);
> +static_assert(offsetof(siginfo_t, si_overrun) == 20);
> +static_assert(offsetof(siginfo_t, si_status) == 24);
> +static_assert(offsetof(siginfo_t, si_utime) == 32);
> +static_assert(offsetof(siginfo_t, si_stime) == 40);
> +static_assert(offsetof(siginfo_t, si_value) == 24);
> +static_assert(offsetof(siginfo_t, si_int) == 24);
> +static_assert(offsetof(siginfo_t, si_ptr) == 24);
> +static_assert(offsetof(siginfo_t, si_addr) == 16);
> +static_assert(offsetof(siginfo_t, si_trapno) == 24);
> +#if 1 /* Correct offsets, obtained from v4.14 */
> +static_assert(offsetof(siginfo_t, si_addr_lsb) == 28);
> +static_assert(offsetof(siginfo_t, si_lower) == 32);
> +static_assert(offsetof(siginfo_t, si_upper) == 40);
> +static_assert(offsetof(siginfo_t, si_pkey) == 32);
> +#else /* Current offsets, as of v4.16 */
> +static_assert(offsetof(siginfo_t, si_addr_lsb) == 32);
> +static_assert(offsetof(siginfo_t, si_lower) == 40);
> +static_assert(offsetof(siginfo_t, si_upper) == 48);
> +static_assert(offsetof(siginfo_t, si_pkey) == 40);
> +#endif
> +static_assert(offsetof(siginfo_t, si_band) == 16);
> +static_assert(offsetof(siginfo_t, si_fd) == 20);
>
> ---
>
> Granted, nobody seems to have noticed because I don't even know if these
> fields have use on sparc64. But I don't yet see this as justification to
> leave things as-is...
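
(Illustration only, not part of the patch quoted above.) The offset shift
in those asserts can be reproduced in userspace with stand-in structs.
This is a minimal sketch under stated assumptions: a 64-bit target with
8-byte pointer alignment, uint64_t standing in for the pointer members,
and a padding rule assumed to mirror the kernel's __ADDR_BND_PKEY_PAD.
The names fault_v414, fault_v416, ptr64_t, SIFIELDS_BASE, BND_PKEY_PAD
and SI_OFF are made up for this example and exist nowhere in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit pointer member. */
typedef uint64_t ptr64_t;

/* The sketch only holds when the stand-in pointer is 8-byte aligned. */
static_assert(_Alignof(ptr64_t) == 8, "sketch assumes 8-byte pointer alignment");

/* Offset of the per-signal fields inside siginfo_t on 64-bit:
 * si_signo, si_errno, si_code (3 ints) plus 4 bytes of padding. */
#define SIFIELDS_BASE 16

/* v4.14-style fault fields on an __ARCH_SI_TRAPNO architecture:
 * addr_lsb lives outside the union, directly after trapno. */
struct fault_v414 {
	ptr64_t addr;
	int trapno;
	short addr_lsb;
	union {
		struct { ptr64_t lower, upper; } addr_bnd;
		uint32_t pkey;
	};
};

/* Assumed padding rule: max(pointer alignment, sizeof(short)). */
#define BND_PKEY_PAD \
	(_Alignof(ptr64_t) < sizeof(short) ? sizeof(short) : _Alignof(ptr64_t))

/* v4.16-style fault fields: addr_lsb moved into the union, so it now
 * inherits the union's pointer alignment and lands after the padding
 * that follows the int trapno. */
struct fault_v416 {
	ptr64_t addr;
	int trapno;
	union {
		short addr_lsb;
		struct { char pad[BND_PKEY_PAD]; ptr64_t lower, upper; } addr_bnd;
		struct { char pad[BND_PKEY_PAD]; uint32_t pkey; } addr_pkey;
	};
};

/* Offset of a fault field as it would appear within siginfo_t. */
#define SI_OFF(type, member) (SIFIELDS_BASE + offsetof(type, member))
```

Under those assumptions SI_OFF(struct fault_v414, addr_lsb) is 28 and
SI_OFF(struct fault_v416, addr_lsb) is 32, which are the two values the
#if/#else branches above disagree on. Without the int trapno, the union
would start right after the pointer and both layouts would agree.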
>
> The collateral damage of this, and the acute problem that I'm having is
> defining si_perf in a sort-of readable and portable way in siginfo_t
> definitions that live outside the kernel, where sparc64 does not yet
> have broken si_addr_lsb. And the same difficulty applies to the kernel
> if we want to unbreak sparc64, while not wanting to move si_perf for
> other architectures.
>
> There are 2 options I see to solve this:
>
> 1. Make things simple again. We could just revert the change moving
>    si_addr_lsb into the union, and sadly accept we'll have to live with
>    that legacy "design" mistake. (si_perf stays in the union, but will
>    unfortunately change its offset for all architectures... this one-off
>    move might be ok because it's new.)
>
> 2. Add special cases to retain si_addr_lsb in the union on architectures
>    that do not have __ARCH_SI_TRAPNO (the majority). I have added a
>    draft patch that would do this below (with some refactoring so that
>    it remains sort-of readable), as an experiment to see how complicated
>    this gets.
>
> Which option do you prefer? Are there better options?

Personally the most important thing to have is a single definition
shared by all architectures so that we consolidate testing.

A little piece of me cries a little whenever I see how badly we
implemented the POSIX design. As specified by POSIX the fields can be
placed in siginfo such that 32bit and 64bit share a common definition.
Unfortunately we did not add padding after si_addr on 32bit to
accommodate a 64bit si_addr.

I find it unfortunate that we are adding yet another definition that
requires translation between 32bit and 64bit, but I am glad that at
least the translation is not architecture specific. That common
definition is what has allowed this potential issue to be caught and
that makes me very happy to see.

Let's go with Option 3.

Confirm BUS_MCEERR_AR, BUS_MCEERR_AO, SEGV_BNDERR, and SEGV_PKUERR are
not in use on any architecture that defines __ARCH_SI_TRAPNO, and then
fix up the userspace definitions of these fields.

To the kernel I would add some BUILD_BUG_ON's to whichever architecture
implementing __ARCH_SI_TRAPNO is best maintained (sparc64?), just to
confirm we don't create future regressions by accident.

I did a quick search and the architectures that define __ARCH_SI_TRAPNO
are sparc, mips, and alpha. All have 64bit implementations. A further
quick search shows that none of those architectures have faults that
use BUS_MCEERR_AR, BUS_MCEERR_AO, SEGV_BNDERR, or SEGV_PKUERR, nor do
they appear to use mm/memory-failure.c.

So it doesn't look like we have an ABI regression to fix.

Eric