Re: [PATCH v2 bpf-next 1/5] ftrace: allow customized flags for ftrace_direct_multi ftrace_ops

Song Liu <songliubraving@xxxxxx> · Thu, 14 Jul 2022 04:37:43 +0000

> On Jul 13, 2022, at 7:55 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> 
> On Thu, 14 Jul 2022 01:42:59 +0000
> Song Liu <songliubraving@xxxxxx> wrote:
> 
>>> As I replied to patch 3, here's my thoughts.
>>> 
>>> DIRECT is treated as though it changes the IP. If you register it to a
>>> function that has an IPMODIFY already set to it, it will call the
>>> ops->ops_func() asking if the ops can use SHARED_IPMODIFY (which means
>>> a DIRECT can be shared with IPMODIFY). If it can, then it returns true,
>>> and the SHARED_IPMODIFY is set *by ftrace*. The user of the ftrace APIs
>>> should not have to manage this. It should be managed by the ftrace
>>> infrastructure.  
>> 
>> Hmm... I don't think this gonna work. 
>> 
>> First, two IPMODIFY ftrace ops cannot work together on the same kernel 
>> function. So there won't be a ops with both IPMODIFY and SHARE_IPMODIFY. 
> 
> I'm not saying that.
> 
> I'm saying that ftrace does not have any clue (nor cares) about what a
> DIRECT ops does. It might modify the IP or it might not. It doesn't know.
> 
> But ftrace has control over the callbacks it does control.
> 
> A DIRECT ops knows if it can work with another ops that has IPMODIFY set.
> If the DIRECT ops does not work with IPMODIFY (perhaps it wants to modify
> the IP), then it will tell ftrace that it can't and ftrace will not allow
> it.
> 
> That is, ftrace doesn't care if the DIRECT ops modifies the IP or not. It
> can only ask (through the ops->ops_func()) if the direct trampoline can
> honor the IP that is modified. If it can, then it reports back that it can,
> and then ftrace will set that ops to SHARED_MODIFY, and the direct function
> can switch the ops->func() to one that uses SHARED_MODIFY.
> 
>> 
>> non-direct ops without IPMODIFY can already share with IPMODIFY ops.
> 
> It can? ftrace sets IPMODIFY for all DIRECT callers to prevent that. Except
> for this patch that removes that restriction (which I believe is broken).

I mean "non-direct" ftrace ops, not direct ftrace ops. 

> 
>> Only direct ops need SHARE_IPMODIFY flag, and it means "I can share with 
>> another ops with IPMODIFY". So there will be different flavors of 
>> direct ops:
> 
> I agree that only DIRECT ops can have SHARED_IPMODIFY set. That's what I'm
> saying. But I'm saying it gets set by ftrace.
> 
>> 
>>  1. w/ IPMODIFY, w/o SHARE_IPMODIFY;
>>  2. w/o IPMODIFY, w/o SHARE_IPMODIFY;
>>  3. w/o IPMODIFY, w/ SHARE_IPMODIFY. 
>> 
>> #1 can never work on the same function with another IPMODIFY ops, and 
>> we don't plan to make it work. #2 does not work with another IPMODIFY 
>> ops. And #3 works with another IPMODIFY ops.
> 
> Lets look at this differently. What I'm proposing is that registering a
> direct ops does not need to tell ftrace if it modifies the IP or not.
> That's because ftrace doesn't care. Once ftrace calls the direct trampoline
> it loses all control. With the ftrace ops callbacks, it is the one
> responsible for setting up the modified IP. That's not the case with the
> direct trampolines.
> 
> I'm saying that ftrace doesn't care what the DIRECT ops does. But it will
> not, by default, allow an IPMODIFY to happen when a DIRECT ops is on the
> same function, or vice versa.
> 
> What I'm suggesting is when a IPMODIFY tries to attach to a function that
> also has a DIRECT ops, or a DIRECT tries to attach to a function that
> already has an IPMODIFY ops on it, that ftrace calls the direct
> ops->ops_func() asking if it is safe to use with an IPMODIFY function.
> 
> If the direct ops modifies the IP itself, it will return a "no", and ftrace
> will reject the attachment. If the direct ops can, it returns a "yes" and
> then ftrace will set the SHARED_IPMODIFY flag to that ops and continue.
> 
> The fact that the ops->ops_func was called will let the caller (bpf) know
> that it needs to use the stack to return to the function, or to call it if
> it is also tracing the return.
> 
> If the IPMODIFY ops is removed, then ftrace will call the ops->ops_func()
> telling it it no longer has the IPMODIFY set, and will clear the
> SHARED_IPMODIFY flag, and then the owner of the direct ops can go back to
> doing whatever it did before (the calling the function directly, or
> whatever).
> 
>> 
>> The owner of the direct trampoline uses these flags to tell ftrace 
>> infrastructure the property of this trampoline. 
> 
> I disagree. The owner shouldn't have to care about the flags. Let ftrace
> handle it. This will make things so much more simple for both BPF and
> ftrace.
> 
>> 
>> BPF trampolines with only fentry calls are #3 direct ops. BPF 
>> trampolines with fexit or fmod_ret calls will be #2 trampoline by 
>> default, but it is also possible to generate #3 trampoline for it.
> 
> And ftrace doesn't care about this. But bpf needs to care if the IP is
> being modified or not. Which can be done by the ops->ops_func() that you
> added.
> 
>> 
>> BPF side will try to register #2 trampoline, If ftrace detects another 
>> IPMODIFY ops on the same function, it will let BPF trampoline know 
>> with -EAGAIN from register_ftrace_direct_multi(). Then, BPF side will 
>> regenerate a #3 trampoline and register it again. 
> 
> This is too complex. You are missing the simple way.
> 
>> 
>> I know this somehow changes the policy with direct ops, but it is the
>> only way this can work, AFAICT. 
> 
> I disagree. There's a much better way that this can work.
> 
>> 
>> Does this make sense? Did I miss something?
> 
> 
> Let me start from the beginning.

I got your point now. We replace the flag on direct trampoline with a 
callback check. So yes, this works. 

> 
> 1. Live kernel patching patches function foo.
> 
> 2. lkp has an ops->flags | IPMODIFY set when it registers.
> 
> 3. bpf registers a direct trampoline to function foo.
> 
> 4. bpf has an ops->flags | DIRECT set when it registers
> 
> 5. ftrace sees that the function already has an ops->flag = IPMODIFY on it,
> so it calls bpf ops->ops_func(SHARE_WITH_IPMODIFY)
> 
> 6. bpf can and does the following
> 
>  a. if it's the simple #1 trampoline (just traces the start of a function)
>     it doesn't need to do anything special returns "yes".
> 
>  b. if it's the #2 trampoline, it will change the trampoline to use the
>     stack to find what to call, and returns "yes".
> 
> 7. ftrace gets "yes" and sets the *ipmodify* ops with SHARED_IPMODIFY
>   (there's a reason for setting this flag for the ipmodify ops and not the
>    direct ops).
> 
> 
> 8. LKP is removed from the function foo.
> 
> 9. ftrace sees the lkp IPMODIFY ops has SHARED_IPMODIFY on it, and knows
>   that there's a direct call here too. It removes the IPMODIFY ops, and
>   then calls the direct ops->ops_func(STOP_SHARE_WITH_IPMODIFY) to let the
>   direct code know that it is no longer sharing with an IPMODIFY such that
>   it can change to call the function directly and not use the stack.

I wonder whether we still need this flag. Alternatively, we can always
find direct calls on the function and calls ops_func(STOP_SHARE_WITH_IPMODIFY). 

What do you think about this? 

Thanks,
Song