On 16.10.2018 16:29, James Bottomley wrote:
> On Tue, 2018-10-16 at 07:34 +0200, Helge Deller wrote:
>> On 15.10.2018 23:11, James Bottomley wrote:
>>> On Sun, 2018-10-14 at 20:34 +0200, Helge Deller wrote:
>>>> This patch adds the necessary code to patch a running SMP kernel
>>>> at runtime to improve performance when running on a single CPU.
>>>>
>>>> The current implementation offers two patching variants:
>>>> - Unwanted assembler statements like locking functions are
>>>>   overwritten with NOPs. When multiple instructions shall be
>>>>   skipped, one branch instruction is used instead of multiple
>>>>   nop instructions.
>>>
>>> This seems like a good idea because our spinlocks are particularly
>>> heavyweight.
>>>
>>>> - Some pdtlb and pitlb instructions are patched to become pdtlb,l
>>>>   and pitlb,l, which only flush the CPU-local tlb entries instead
>>>>   of broadcasting the flush to other CPUs in the system and thus
>>>>   may improve performance.
>>>
>>> I really don't think this matters: on a UP system, pdtlb,l and
>>> pdtlb are the same instruction because the CPU already knows it has
>>> no internal CPU bus to broadcast the purge over so it in effect
>>> executes a pdtlb,l regardless.
>>
>> I'd be happy to drop this part again.
>> But is that true on an SMP system, where one has booted with
>> maxcpus=1, too?
>
> I don't think so because the secondaries will all be in their active
> boot loops, so the internal coherence bus will also be active. It's
> not really clear that's a common case, though ...

Ok, since it doesn't hurt to keep the pdtlb->pdtlb,l replacement,
I think we simply keep it. It doesn't generate overhead either.

Helge
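
For readers following the thread, the two patching variants described in the quoted patch text could be sketched roughly as below. This is only an illustration, not the code from the patch: the function names and the SKETCH_* constants are hypothetical, and the instruction encodings (the NOP word, the "b,n .+8" word, the step of 8 per skipped word, and the position of the ",l" bit) are assumptions that should be checked against the PA-RISC architecture manual before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings, for illustration only (not taken from the patch). */
#define SKETCH_INSN_NOP  0x08000240u /* assumed: "or %r0,%r0,%r0" */
#define SKETCH_INSN_BN   0xe8000002u /* assumed: "b,n .+8" */
#define SKETCH_LOCAL_BIT (1u << 10)  /* assumed: bit selecting the ",l" form */

/*
 * Variant 1: make a run of `len` instruction words a no-op.
 * A single word becomes a NOP. For a longer run, only the first word
 * is rewritten into one nullifying branch that jumps past the run;
 * the remaining words are left in place but are never executed.
 */
static void sketch_skip_insns(uint32_t *insn, size_t len)
{
    assert(insn != NULL && len >= 1);

    if (len == 1) {
        insn[0] = SKETCH_INSN_NOP;
        return;
    }

    /* Assumption: each additional skipped word adds 8 to the branch word. */
    insn[0] = SKETCH_INSN_BN + (uint32_t)(len - 2) * 8u;
}

/*
 * Variant 2: turn a pdtlb/pitlb word into its CPU-local form by
 * setting the (assumed) local bit; all other bits are preserved.
 */
static uint32_t sketch_make_tlb_purge_local(uint32_t insn)
{
    return insn | SKETCH_LOCAL_BIT;
}
```

In this sketch, a run of three words keeps its second and third words unchanged and only the first one turns into the branch, which matches the "one branch instruction instead of multiple nop instructions" wording above.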