On Mon, Apr 29, 2019 at 11:06 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> It does *not* emulate the "call" in the BP handler itself, instead it
> replaces the %ip (the same way all the other BP handlers replace the
> %ip) with a code sequence that just does
>
>         push %gs:bp_call_return
>         jmp *%gs:bp_call_target
>
> after having filled in those per-cpu things.

Note that if you read the patch, you'll see that my explanation
glossed over the "what if an interrupt happens" part. That is handled
by having two handlers, one for "interrupts were already disabled" and
one for "interrupts were enabled, so I disabled them before entering
the handler".

The second handler does the same push/jmp sequence, but has a "sti"
before the jmp. Because of the one-instruction sti shadow, interrupts
won't actually be enabled until after the jmp instruction has
completed, and thus the "push/jmp" is atomic wrt regular interrupts.

It's not safe wrt NMI, of course, but since NMI won't be rescheduling,
and since any SMP IPI won't be punching through that sequence anyway,
it's still atomic wrt _another_ text_poke() attempt coming in and
re-using the bp_call_return/bp_call_target slots.

So yeah, it's not "one-liner" trivial, but it's not like it's
complicated either, and it actually matches the existing "call this
code to emulate the replaced instruction" approach. So I'd much rather
have a couple of tens of lines of code here that still acts pretty much
exactly like all the other rewriting does, rather than play subtle
games with the entry stack frame.

Finally: there might be other situations where you want to have this
kind of "pseudo-atomic" replacement sequence, so while it's a hack
specific to emulating a "call" instruction, I don't think it is
conceptually limited to just that case.

              Linus
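
[Editorial sketch, not part of the patch.] The sti-shadow argument above
can be illustrated with a toy model in plain C. This is only a sketch
under stated assumptions: it is not kernel code, every name in it
(toy_op, toy_run, toy_push_sti_jmp, toy_sti_push_jmp) is hypothetical,
and it models just one thing, namely that a pending maskable interrupt
is not delivered on the instruction boundary right after an sti that
turned interrupts on. It assumes the second handler's ordering is
push, sti, jmp, which is how "has a sti before the jmp" reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only, not kernel code. Every name below is hypothetical. */
enum toy_op { TOY_PUSH, TOY_STI, TOY_JMP };

/*
 * Run ops[] with one maskable interrupt pending from the start.
 * Returns how many instructions completed before the interrupt was
 * delivered, or -1 if it was never delivered.
 */
int toy_run(const enum toy_op *ops, size_t n, bool irqs_on_at_entry)
{
	bool irqs_on = irqs_on_at_entry;  /* models EFLAGS.IF */
	bool shadow = false;              /* true for one boundary after sti */

	assert(ops != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Instruction boundary: deliver if enabled and not shadowed. */
		if (irqs_on && !shadow)
			return (int)i;
		shadow = false;

		if (ops[i] == TOY_STI) {
			/* The shadow only applies when IF was clear before sti. */
			shadow = !irqs_on;
			irqs_on = true;
		}
		/* TOY_PUSH and TOY_JMP do not touch the interrupt state. */
	}
	return (irqs_on && !shadow) ? (int)n : -1;
}

/* push; sti; jmp: the ordering assumed for the second handler. */
int toy_push_sti_jmp(void)
{
	static const enum toy_op seq[] = { TOY_PUSH, TOY_STI, TOY_JMP };
	return toy_run(seq, 3, false);
}

/* sti; push; jmp: a deliberately wrong ordering, for contrast. */
int toy_sti_push_jmp(void)
{
	static const enum toy_op seq[] = { TOY_STI, TOY_PUSH, TOY_JMP };
	return toy_run(seq, 3, false);
}
```

In this model toy_push_sti_jmp() returns 3, meaning the interrupt is
only taken after the jmp has completed, while toy_sti_push_jmp()
returns 2, meaning the interrupt lands between the push and the jmp.
That is why the sti has to sit directly in front of the jmp. The model
says nothing about NMIs, which ignore the interrupt flag entirely.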