Steven Rostedt <rostedt@xxxxxxxxxxx> writes: > On Tue, 23 Apr 2019 20:15:49 +0200 > Nicolai Stange <nstange@xxxxxxx> wrote: >> Steven Rostedt <rostedt@xxxxxxxxxxx> writes: >> > For 32 bit, we could add 4 variables on the thread_info and make 4 >> > trampolines, one for each context (normal, softirq, irq and NMI), and >> > have them use the variable stored in the thread_info for that location: >> > >> > ftrace_int3_tramp_normal >> > push %eax # just for space >> > push %eax >> > mov whatever_to_get_thread_info, %eax >> > mov normal(%eax), %eax >> > mov %eax, 4(%esp) >> > pop %eax >> > jmp ftrace_caller >> > >> > ftrace_int3_tramp_softiqr >> > push %eax # just for space >> > push %eax >> > mov whatever_to_get_thread_info, %eax >> > mov softirq(%eax), %eax >> > mov %eax, 4(%esp) >> > pop %eax >> > jmp ftrace_caller >> > >> > etc.. >> > >> > >> > A bit crazier for 32 bit but it can still be done. ;-) >> >> Ok, but currently CONFIG_HAVE_LIVEPATCH=n for x86 && !x86_64, >> so I'd rather not invest too much energy into screwing 32 bit up ;) >> > > Probably not something you care about, but something that I do. Which > means it will have to go on my TODO list. I care about missed functions > being called. This means, if you have something tracing a function, and > then enable something else to trace that same function, you might miss > the first one. Alright, if there's a use case beyond live patching, I can try to handle 32 bits alongside, of course. However, some care will be needed when determining the actual context from ftrace_int3_handler(): tracing anything before the addition or after the subtraction of HARDIRQ_OFFSET to/from preempt_count in irq_enter() resp. irq_exit() could otherwise clobber the "normal" state in thread_info, correct? How about having a fixed size stack with three entries shared between "normal", "irq" and "softirq" in thread_info, as well as a dedicated slot for nmi context? (The stack would be modified/consumed only with irqs disabled). Which makes me wonder: - if we disabled preemption from ftrace_int3_handler() and reenabled it again from the *_int3_tramp*s, these stacks could be made per-CPU, AFAICS, - and why not use this strategy on x86_64, too? The advantages would be a common implemention between 32 and 64 bit and more importantly, not relying on that %r11 hack. When doing the initial RFC patch, it wasn't clear to me that at most one stack slot per context would be needed, i.e. only four in total... What do you think? Nicolai -- SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)