Catalin Marinas <catalin.marinas@xxxxxxx> writes:

> Hi Eric,
>
> On Thu, Dec 12, 2019 at 12:26:41PM -0600, Eric W. Biederman wrote:
>> Arnd Bergmann <arnd@xxxxxxxx> writes:
>> > On Wed, Dec 11, 2019 at 7:40 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>> >>
>> >> From: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
>> >>
>> >> Add MTE-specific SIGSEGV codes to siginfo.h.
>> >>
>> >> Note that for MTE we are reusing the same SPARC ADI codes because
>> >> the two functionalities are similar and they cannot coexist on the same
>> >> system.
>>
>> Please Please Please don't do that.
>>
>> It is actively harmful to have architecture specific si_code values.
>> As it makes maintenance much more difficult.
>>
>> Especially as the si_codes are part of the union discriminator.
>>
>> If your functionality is identical reuse the numbers otherwise please
>> just select the next numbers not yet used.
>
> It makes sense.
>
>> We have at least 256 si_codes per signal, 2**32 if we really need them,
>> so there is no need to reuse numbers.
>>
>> The practical problem is that architecture specific si_codes start
>> turning kernel/signal.c into #ifdef soup, and we lose a lot of
>> basic compile coverage because of that.  In turn not compiling the code
>> leads to bit-rot in all kinds of weird places.
>
> Fortunately for MTE we don't need to change kernel/signal.c. It's
> sufficient to call force_sig_fault() from the arch code with the
> corresponding signo, code and fault address.

Hooray for force_sig_fault at keeping people honest about which
parameters they are passing.

So far it looks like it is just BUS_MCEERR_AR, BUS_MCEERR_AO,
SEGV_BNDERR, and SEGV_PKUERR that are the really confusing ones,
as they go beyond the ordinary force_sig_fault layout.

But we really do need the knowledge of how all of the cases are encoded
or things can get very confusing.  Especially when mixing 32bit and
64bit code.

>> p.s. As for coexistence there is always the possibility that one chip
>> in a cpu family supports one thing and another chip in a cpu
>> family supports another.  So userspace may have to cope with the
>> situation even if an individual chip doesn't.
>>
>> I remember a similar case where sparc had several distinct page table
>> formats and we had a single kernel that had to cope with them all.
>
> We have such fun on ARM as well with the big.LITTLE systems where not
> all CPUs support the same features. For example, MTE is only enabled
> once all the secondary CPUs have booted and confirmed to have the
> feature.

Which all makes it possible that the alternative to MTE referenced as
ADI might show up in some future ARM chip.  Which really makes reusing
the numbers a bad idea.

Not that I actually recall what any of this functionality actually is,
but I can tell when people are setting themselves up for a challenge
unnecessarily.

Eric