On 12/28/23 3:00 AM, Alexei Starovoitov wrote:
On Wed, Dec 27, 2023 at 12:20 AM D. Wythe <alibuda@xxxxxxxxxxxxxxxxx> wrote:
Hi Alexei,
IMMO, nf_unregister_net_hook does not wait for the completion of the
execution of the hook that is being removed,
instead, it allocates a new array without the very hook to replace the
old arrayvia rcu_assign_pointer() (in __nf_hook_entries_try_shrink),
then it use call_rcu() to release the old one.
You can find more details in commit
8c873e2199700c2de7dbd5eedb9d90d5f109462b.
In other words, when nf_unregister_net_hook returns, there may still be
contexts executing hooks on the
old array, which means that the `link` may still be accessed after
nf_unregister_net_hook returns.
And that's the reason why we use kfree_rcu() to release the `link`.
nf_hook_run_bpf
const struct
bpf_nf_link *nf_link = bpf_link;
bpf_nf_link_release
nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops);
bpf_nf_link_dealloc
free(link)
bpf_prog_run(link->prog);
Got it.
Sounds like it's an existing bug. If so it should be an independent
patch with Fixes tag.
Also please craft a test case to demonstrate UAF.
It is not an existing bug... Accessing the link within the hook was
something I introduced here
to support updates😉, as previously there was no access to the link
within the hook.
I must admit that it is indeed feasible if we eliminate the mutex and
use cmpxchg to swap the prog (we need to ensure that there is only one
bpf_prog_put() on the old prog).
However, when cmpxchg fails, it means that this context has not
outcompeted the other one, and we have to return a failure. Maybe
something like this:
if (!cmpxchg(&link->prog, old_prog, new_prog)) {
/* already replaced by another link_update */
return -xxx;
}
As a comparison, The version with the mutex wouldn't encounter this
error, every update would succeed. I think that it's too harsh for the
user to receive a failure
in that case since they haven't done anything wrong.
Disagree. The mutex doesn't prevent this issue.
There is always a race.
It happens when link_update.old_prog_fd and BPF_F_REPLACE
were specified.
One user space passes an FD of the old prog and
another user space doing the same. They both race and one of them
gets
if (old_prog && link->prog != old_prog) {
err = -EPERM;
it's no different with dropping the mutex and doing:
if (old_prog) {
if (!cmpxchg(&link->prog, old_prog, new_prog))
-EPERM
} else {
old_prog = xchg(&link->prog, new_prog);
}
Got it! It's very helpful, Thanks very much! I will modify my patch
accordingly.
Best wishes,
D. Wythe