Re: [RFC nf-next v3 1/2] netfilter: bpf: support prog update

"D. Wythe" <alibuda@xxxxxxxxxxxxxxxxx> · Thu, 28 Dec 2023 19:06:39 +0800

On 12/28/23 3:00 AM, Alexei Starovoitov wrote:
On Wed, Dec 27, 2023 at 12:20 AM D. Wythe <alibuda@xxxxxxxxxxxxxxxxx> wrote:

Hi Alexei,

IMHO, nf_unregister_net_hook does not wait for the hook being removed
to finish executing. Instead, it allocates a new array without that
hook, replaces the old array via rcu_assign_pointer() (in
__nf_hook_entries_try_shrink), and then uses call_rcu() to release the
old one.

You can find more details in commit
8c873e2199700c2de7dbd5eedb9d90d5f109462b.
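
As a rough userspace illustration of the replace-then-defer pattern described above (this is not the kernel code): the sketch builds a new array without the removed entry, publishes it with an atomic store standing in for rcu_assign_pointer(), and hands the old array back to the caller, who would free it only after a grace period. The names struct hook_entries and entries_remove() are hypothetical, and hooks are plain ints for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the hook array; hooks are just ints here. */
struct hook_entries {
    size_t n;
    int hooks[];
};

/*
 * Publish a copy of the current array without `hook` and return the old
 * array. Readers that loaded the old pointer earlier may still be walking
 * it, so the caller must defer freeing it (call_rcu() in the kernel).
 * Returns NULL if the new array cannot be allocated; nothing is published
 * in that case.
 */
static struct hook_entries *entries_remove(_Atomic(struct hook_entries *) *slot,
                                           int hook)
{
    struct hook_entries *old = atomic_load(slot);
    size_t keep = 0, i, j = 0;

    assert(old != NULL);
    for (i = 0; i < old->n; i++)
        if (old->hooks[i] != hook)
            keep++;

    struct hook_entries *fresh = malloc(sizeof(*fresh) + keep * sizeof(int));
    if (!fresh)
        return NULL;

    fresh->n = keep;
    for (i = 0; i < old->n; i++)
        if (old->hooks[i] != hook)
            fresh->hooks[j++] = old->hooks[i];

    /* Stands in for rcu_assign_pointer(): new readers see `fresh`. */
    atomic_store(slot, fresh);
    return old;
}
```

Note that returning from entries_remove() says nothing about readers that already hold the old pointer, which is the point made in the mail.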

In other words, when nf_unregister_net_hook returns, there may still be
contexts executing hooks on the
old array, which means that the `link` may still be accessed after
nf_unregister_net_hook returns.

And that's the reason why we use kfree_rcu() to release the `link`.

                                                nf_hook_run_bpf
                                                  const struct bpf_nf_link *nf_link = bpf_link;

bpf_nf_link_release
  nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops);

bpf_nf_link_dealloc
  free(link)
                                                  bpf_prog_run(link->prog);

Got it.
Sounds like it's an existing bug. If so it should be an independent
patch with Fixes tag.

Also please craft a test case to demonstrate UAF.

It is not an existing bug... Accessing the link within the hook was 
something I introduced here
to support updates😉, as previously there was no access to the link 
within the hook.
I must admit that it is indeed feasible if we eliminate the mutex and
use cmpxchg to swap the prog (we need to ensure that there is only one
bpf_prog_put() on the old prog).
However, when cmpxchg fails, it means that this context has not
outcompeted the other one, and we have to return a failure. Maybe
something like this:

/* cmpxchg() returns the previous value, so compare it against old_prog */
if (cmpxchg(&link->prog, old_prog, new_prog) != old_prog) {
      /* already replaced by another link_update */
      return -xxx;
}

By comparison, the version with the mutex wouldn't encounter this
error; every update would succeed. I think it's too harsh for the
user to receive a failure in that case, since they haven't done
anything wrong.
Disagree. The mutex doesn't prevent this issue.
There is always a race.
It happens when link_update.old_prog_fd and BPF_F_REPLACE
were specified.
One user-space process passes an FD of the old prog and
another does the same. They both race and one of them
gets:
if (old_prog && link->prog != old_prog) {
                err = -EPERM;

It's no different from dropping the mutex and doing:
if (old_prog) {
     /* cmpxchg() returns the previous value of link->prog */
     if (cmpxchg(&link->prog, old_prog, new_prog) != old_prog)
       return -EPERM;
} else {
    old_prog = xchg(&link->prog, new_prog);
}
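
For readers who want to try the lock-free swap outside the kernel, here is a minimal userspace sketch using C11 atomics in place of cmpxchg()/xchg(). It is only an illustration under stated assumptions: struct prog, struct link and link_update() are hypothetical stand-ins, not the kernel's types or the code from the patch. The `replaced` out-parameter shows why exactly one caller ends up responsible for dropping the old program.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's prog and link objects. */
struct prog { int id; };
struct link { _Atomic(struct prog *) prog; };

/*
 * Swap l->prog to new_prog without taking a lock.
 *
 * If `expected` is non-NULL (the BPF_F_REPLACE case), the swap happens
 * only when the link still holds `expected`; otherwise -EPERM is returned
 * and the link is left unchanged.
 * On success, *replaced receives the program that was swapped out, so only
 * the winning caller gets to drop the reference to it.
 */
static int link_update(struct link *l, struct prog *new_prog,
                       struct prog *expected, struct prog **replaced)
{
    assert(new_prog != NULL);

    if (expected) {
        struct prog *cur = expected;

        /* Counterpart of cmpxchg(&link->prog, old_prog, new_prog). */
        if (!atomic_compare_exchange_strong(&l->prog, &cur, new_prog))
            return -EPERM; /* lost the race: someone replaced it first */
        *replaced = expected;
    } else {
        /* Counterpart of xchg(&link->prog, new_prog). */
        *replaced = atomic_exchange(&l->prog, new_prog);
    }
    return 0;
}
```

If two callers pass the same `expected` program, only one compare-and-exchange can succeed; the other gets -EPERM, the same outcome one of the racing updaters gets from the old_prog check in the mutex version.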

Got it! That's very helpful, thanks very much! I will modify my patch
accordingly.

Best wishes,
D. Wythe