On Sun, 16 Dec 2018, Rich Felker wrote:

> So in theory it's possible that there's a cpu model with fancy new
> core instructions but no fpu. In this case, you would need the
> capability to emulate or execute-out-of-line these instructions. But I
> have no idea if such cpu models actually exist. If not, the concern
> can probably be ignored and it suffices to emulate just the parts of
> the base ISA that are valid in delay slots.

 What do you call "a cpu model with fancy new core instructions"?

 We've gone through 4 legacy MIPS base ISA revisions (I to IV) and then
4 modern ones that matter (R1 to R5; R4 was left out and R6 actually
does not have FPU branch delay slots), plus a bunch of ASEs (Application
Specific Extensions), such as DSP, MDMX, MIPS-3D, MSA, etc., each
defining further instructions.  And then the microMIPS R3 and R5 ISAs
(R6 uses a different instruction encoding and does not have delay slots
at all).  The MIPS16 ISA does not count however, even though it has
delay slots and we support it, because it does not have FPU
instructions, let alone ones that require delay slot emulation.

 Some of the ASEs do not matter, e.g. we don't support MDMX in Linux as
it has user state we don't handle with context switches, and MIPS-3D and
MSA both imply an FPU, so software making use of them won't run with our
FPU emulation anyway as these ASEs' instructions are not emulated.
Anything else is potentially required.

 As to actual implementations I believe all the Cavium Octeon line CPUs
(David, please correct me if I am wrong) have no FPU and they have
vendor extensions beyond the base ISA + ASE instruction set.  Arguably
you could say that their additional instructions should not be scheduled
into FPU branch delay slots then, however the toolchain will happily do
that, as I wrote before.

 I don't fully remember what the situation is WRT NetLogic/Broadcom XLR
and XLP chips.  They do have vendor extensions, though IIRC they do have
an FPU too.

 But then we have the "nofpu" kernel parameter anyway, which forces FPU
emulation for any hardware, so we need to emulate delay slots in that
mode with any hardware.

 I'm afraid the problem is complex to solve overall, which is why we
still have issues, 18 years on from the inclusion of the FPU emulator:

commit 4c55adaa6d06e5533aebaceea7640ecf10952231
Author: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
Date:   Sat Nov 25 04:49:46 2000 +0000

    Kernel FPU emulator, chain saw edition.

(in the LMO GIT repo) and I think actually running the delay-slot
instruction (with a possible exception for things like ADDIUPC) rather
than interpreting it is the only feasible solution.

 I'm not involved with MIPS architecture development anymore though and
at this point I only care about the few legacy platforms I have been
taking care of since forever, such as the DECstation port, for which our
current emulation solution suffices, so I am not going to commit myself
to making any inventions in this area.  I hope my input is valuable
though and will help someone working on this.

  Maciej