On Sat, 15 Dec 2018, Rich Felker wrote:

> > A possibly nicer way to accomplish more or less the same thing would
> > be to allocate the area with _install_special_mapping() and arrange to
> > keep a reference to the struct page around.
> >
> > The really nice but less compatible fix would be to let processes or
> > even the whole system opt out by promising not to put anything in FPU
> > branch delay slots, of course.
>
> As I noted on Twitter when Mudge brought this topic back up, there's a
> much more compatible, elegant, and safe fix possible that does not
> involve any W+X memory. Emulate the delay slot in kernel-space. This
> is trivial to do safely for pretty much everything but loads/stores.

 I think "trivial" is an understatement, you at least need to decode the
delay-slot instruction enough to tell privileged and user instructions
apart and send SIGILL where appropriate.  Some user instructions send
exceptions too and you need to handle them accordingly.

 OTOH, for things like ADDIUPC you need to interpret the instruction
anyway, as the value of the PC used for calculation will be wrong except
in the original location.

> For loads/stores, where you want them to execute with user privilege
> level, what you do is compute the effective address in kernel-space,
> then return to a fixed instruction in the vdso page that performs a
> generic load/store using the register the kernel put the effective
> address result in, then restores registers off the stack and jumps to
> the branch destination.

 What about all the odd and especially vendor-specific load/store
instructions like ASET, SAA or SWAPW?  Would we need to have all the
possible encodings provided in the VDSO?

  Maciej