Re: [RFC PATCH 1/1] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 24 Apr 2019 08:20:12 +0200
Nicolai Stange <nstange@xxxxxxx> wrote:

> Alright, if there's a use case beyond live patching, I can try to handle
> 32 bits alongside, of course.
> 
> However, some care will be needed when determining the actual context
> from ftrace_int3_handler(): tracing anything before the addition or
> after the subtraction of HARDIRQ_OFFSET to/from preempt_count in
> irq_enter() resp. irq_exit() could otherwise clobber the "normal" state
> in thread_info, correct?
> 
> How about having a fixed size stack with three entries shared between
> "normal", "irq" and "softirq" in thread_info, as well as a dedicated
> slot for nmi context? (The stack would be modified/consumed only with
> irqs disabled).
> 
> Which makes me wonder:
> - if we disabled preemption from ftrace_int3_handler() and reenabled it
>   again from the *_int3_tramp*s, these stacks could be made per-CPU,
>   AFAICS,

Heh, here you go again, having some of the ideas I came up with as well.

I thought about this use per-cpu variables and this preempt disable
hack as well. But thinking more on it, it goes into the "too crazy, and
too fragile" category. It breaks a lot of norms (but I do that a lot
anyway ;) and decided that it would probably not be accepted by others.

I much rather use the above thread info solution. As for the
irq_enter/exit() issue, instead of using the preempt_count context, we
could just have a counter in the thread info as well:

	depth = local_add_return(thread_info->ftrace_int_depth);
	thread_info->slot[depth] = *(pt_regs->sp);

Where thread_info->slot is an array of 4 long words.

And then in the trampoline: (pseudo code)

	push	%rax	// just a place holder
	push	%rax
	push	%rdx
	mov	thread->depth, %rax
	mov	thread->slot(%rax), %rdx
	mov	%rdx, 16(%rsp)
	add	-1,%rax
	mov	%rax, thread->depth
	pop	%rdx
	pop	%rax
	jmp	ftrace_caller
	
If an interrupt were to come in at any time, it would just increment
the depth, use the next slot, and then decrement it back to what the
depth was when it interrupted the code.

-- Steve

> - and why not use this strategy on x86_64, too? The advantages would be
>   a common implemention between 32 and 64 bit and more importantly, not
>   relying on that %r11 hack. When doing the initial RFC patch, it
>   wasn't clear to me that at most one stack slot per context would be
>   needed, i.e. only four in total...




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux Kernel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux