On Thu, 29 Apr 2021 at 19:24, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
[...]
> > Granted, nobody seems to have noticed because I don't even know if these
> > fields have use on sparc64. But I don't yet see this as justification to
> > leave things as-is...
> >
> > The collateral damage of this, and the acute problem that I'm having is
> > defining si_perf in a sort-of readable and portable way in siginfo_t
> > definitions that live outside the kernel, where sparc64 does not yet
> > have broken si_addr_lsb. And the same difficulty applies to the kernel
> > if we want to unbreak sparc64, while not wanting to move si_perf for
> > other architectures.
> >
> > There are 2 options I see to solve this:
> >
> > 1. Make things simple again. We could just revert the change moving
> >    si_addr_lsb into the union, and sadly accept we'll have to live with
> >    that legacy "design" mistake. (si_perf stays in the union, but will
> >    unfortunately change its offset for all architectures... this one-off
> >    move might be ok because it's new.)
> >
> > 2. Add special cases to retain si_addr_lsb in the union on architectures
> >    that do not have __ARCH_SI_TRAPNO (the majority). I have added a
> >    draft patch that would do this below (with some refactoring so that
> >    it remains sort-of readable), as an experiment to see how complicated
> >    this gets.
> >
> > Which option do you prefer? Are there better options?
>
> Personally the most important thing to have is a single definition
> shared by all architectures so that we consolidate testing.
>
> A little piece of me cries a little whenever I see how badly we
> implemented the POSIX design. As specified by POSIX the fields can be
> placed in siginfo such that 32bit and 64bit share a common definition.
> Unfortunately we did not add padding after si_addr on 32bit to
> accommodate a 64bit si_addr.

I think it's even worse than that, see the fun I had with siginfo last week:
https://lkml.kernel.org/r/20210422191823.79012-1-elver@xxxxxxxxxx
... because of the 3 initial ints and no padding after them, we can't
portably add __u64 fields to siginfo, and are forever forced to have
subtly different behaviour between 32-bit and 64-bit architectures. :-/

> I find it unfortunate that we are adding yet another definition that
> requires translation between 32bit and 64bit, but I am glad
> that at least the translation is not architecture specific. That common
> definition is what has allowed this potential issue to be caught
> and that makes me very happy to see.
>
> Let's go with Option 3.
>
> Confirm BUS_MCEERR_AR, BUS_MCEERR_AO, SEGV_BNDERR, SEGV_PKUERR are not
> in use on any architecture that defines __ARCH_SI_TRAPNO, and then fixup
> the userspace definitions of these fields.
>
> To the kernel I would add some BUILD_BUG_ON's to whatever the best
> maintained architecture (sparc64?) that implements __ARCH_SI_TRAPNO just
> to confirm we don't create future regressions by accident.
>
> I did a quick search and the architectures that define __ARCH_SI_TRAPNO
> are sparc, mips, and alpha. All have 64bit implementations. A further
> quick search shows that none of those architectures have faults that
> use BUS_MCEERR_AR, BUS_MCEERR_AO, SEGV_BNDERR, SEGV_PKUERR, nor do
> they appear to use mm/memory-failure.c
>
> So it doesn't look like we have an ABI regression to fix.

That sounds fine to me -- my guess was that they're not used on these
architectures, but I just couldn't make that call.

I have patches adding compile-time asserts for sparc64, arm, arm64 ready to
go. I'll send them after some more testing.

Thanks,
-- Marco
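
For illustration only, here is a minimal userspace sketch of the kind of
compile-time layout assert discussed above. The struct is a hypothetical,
simplified stand-in for siginfo_t (three leading ints, then a union holding
an address and an si_addr_lsb-like short). All demo_* names are made up;
this is neither the kernel's definition nor the actual patches. It assumes
a 4-byte int and a C11 compiler.

```c
#include <assert.h>  /* static_assert (C11), assert */
#include <stddef.h>  /* offsetof, size_t */

/*
 * Hypothetical, simplified mirror of the siginfo layout: three ints
 * followed by a union. NOT the real uapi definition.
 */
struct demo_siginfo {
	int si_signo;
	int si_errno;
	int si_code;
	union {
		struct {
			void *addr;      /* stands in for si_addr */
			short addr_lsb;  /* stands in for si_addr_lsb */
		} fault;
	} fields;
};

/*
 * The three ints take 12 bytes. With 8-byte pointers the union is
 * 8-byte aligned, so the compiler inserts 4 bytes of padding and the
 * union starts at 16; with 4-byte pointers it starts at 12.
 * (Assumes sizeof(int) == 4.)
 */
static_assert(offsetof(struct demo_siginfo, fields) ==
	      (sizeof(void *) == 8 ? 16 : 12),
	      "union offset changed");

/* The short sits directly after the pointer-sized address field. */
static_assert(offsetof(struct demo_siginfo, fields.fault.addr_lsb) ==
	      offsetof(struct demo_siginfo, fields) + sizeof(void *),
	      "addr_lsb offset changed");

/* Small helpers so the offsets can also be inspected at run time. */
static size_t demo_fields_offset(void)
{
	return offsetof(struct demo_siginfo, fields);
}

static size_t demo_addr_lsb_offset(void)
{
	return offsetof(struct demo_siginfo, fields.fault.addr_lsb);
}
```

If a later change moved a field in this stand-in struct, the build would
fail at the static_assert instead of silently changing the layout, which
is the property the BUILD_BUG_ON suggestion is after.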