Re: [PATCH v2 bpf-next 5/5] bpf: trampoline: support FTRACE_OPS_FL_SHARE_IPMODIFY

Steven Rostedt <rostedt@xxxxxxxxxxx> · Wed, 6 Jul 2022 18:29:31 -0400

On Wed, 6 Jul 2022 22:15:47 +0000
Song Liu <songliubraving@xxxxxx> wrote:

> > On Jul 6, 2022, at 2:40 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > 
> > On Wed, 6 Jul 2022 21:37:52 +0000
> > Song Liu <songliubraving@xxxxxx> wrote:
> >   
> >>> Can you comment here that returning -EAGAIN will not cause this to repeat.
> > >>> That it will change things where the next try will not return -EAGAIN?
> >> 
> >> Hmm.. this is not the guarantee here. This conflict is a real race condition 
> >> that an IPMODIFY function (i.e. livepatch) is being registered at the same time 
> >> when something else, for example bpftrace, is updating the BPF trampoline. 
> >> 
> >> This EAGAIN will propagate to the user of the IPMODIFY function (i.e. livepatch),
> >> and we need to retry there. In the case of livepatch, the retry is initiated 
> >> from user space.   
> > 
> > We need to be careful here then. If there's a userspace application that
> > runs at real-time and does a:
> > 
> > 	do {
> > 		errno = 0;
> > 		register_bpf();
> > 	} while (errno != -EAGAIN);  
> 
> Actually, do you mean:
> 
> 	do {
> 		errno = 0;
> 		register_bpf();
> 	} while (errno == -EAGAIN);
> 
> (== -EAGAIN) here?

Yeah, of course.
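
For reference, the corrected loop can be written out as a small user-space
sketch. Everything here is an assumption for illustration: register_bpf() is
a hypothetical stub (it is not a real API), fail_count only simulates
contention, and the code follows the user-space convention of returning -1
with errno set to the positive EAGAIN rather than the kernel-style -EAGAIN
written above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical knob: how many times the stub reports contention. */
static int fail_count;

/* Hypothetical stub standing in for the registration call. */
static int register_bpf(void)
{
	if (fail_count > 0) {
		fail_count--;
		errno = EAGAIN;
		return -1;
	}
	return 0;
}

/*
 * Retry while the call reports EAGAIN and return the number of calls made.
 * Nothing in this loop sleeps or yields. For a task running at real-time
 * priority that is the hazard: a lower-priority lock owner pinned to the
 * same CPU never gets to run and release the lock.
 */
static int register_with_retry(void)
{
	int calls = 0;

	do {
		errno = 0;
		register_bpf();
		calls++;
	} while (errno == EAGAIN);

	return calls;
}
```

The stub always succeeds eventually, so the loop terminates here. In the
real-time scenario nothing guarantees that.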

> 
> In this specific race condition, register_bpf() will succeed, as it already
> got tr->mutex. But the IPMODIFY (livepatch) side will fail and retry. 

What else takes the tr->mutex ?

If it preempts anything else taking that mutex, when this runs, then it
needs to be careful.

You said this can happen when the live patch came first. This isn't racing
against live patch, it's racing against anything that takes the tr->mutex
and then adds a bpf trampoline to a location that has a live patch.

> 
> Since both livepatch and bpf trampoline changes are rare operations, I think 
> the chance of the race condition is low enough. 
> 
> Does this make sense?
> 

It's low, and if it is also a privileged operation then there's less to be
concerned about. If it is not, then we could have a way to deadlock the
system. I'm more concerned that this will lead to a CVE than it just
happening randomly. In other words, it only takes something that can run at
a real-time priority to connect to a live patch location, and something
that runs at a low priority to take a tr->mutex. If an attacker has both,
then it can pin both to a CPU and deadlock the system.

One hack to fix this is to add a msleep(1) in the failed case of the
trylock. This will at least give the owner of the lock a millisecond to
release it. This was what the RT patch used to do with spin_trylock() that
was converted to a mutex (and we worked hard to remove all of them).
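
The msleep(1) idea above can be sketched in user space roughly as follows.
This is only an analogy, not the kernel patch: tr_mutex is a hypothetical
pthread mutex standing in for tr->mutex, trylock_or_backoff() is a made-up
helper name, and nanosleep() stands in for msleep(1).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for tr->mutex; not the kernel object. */
static pthread_mutex_t tr_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Sleep for roughly one millisecond, the user-space analog of msleep(1). */
static void sleep_1ms(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

	nanosleep(&ts, NULL);
}

/*
 * Try to take the lock. On contention, sleep before reporting -EAGAIN so
 * that a lower-priority owner gets a chance to run and release the lock
 * before the caller retries.
 * Returns 0 with the lock held, or -EAGAIN without it.
 */
static int trylock_or_backoff(pthread_mutex_t *lock)
{
	if (pthread_mutex_trylock(lock) == 0)
		return 0;

	sleep_1ms();
	return -EAGAIN;
}
```

The sleep does not remove the race. It only means a high-priority caller
that retries in a tight loop gives up the CPU once per failed attempt.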

-- Steve