On 12/09/2024 14:17, Breno Leitao wrote:
Hello Sebastian,
Thanks for the quick reply!
On Thu, Sep 12, 2024 at 02:28:47PM +0200, Sebastian Andrzej Siewior wrote:
On 2024-09-12 05:06:36 [-0700], Breno Leitao wrote:
Hello Sebastian, Jakub,
Hi,
I've seen some crashes in 6.11-rc7 that seem related to 401cb7dae8130
("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.").
Basically bpf_net_context is NULL, and it is being dereferenced by
bpf_net_ctx->ri.kern_flags (offset 0x38) in the following code.
static inline struct bpf_redirect_info *bpf_net_ctx_get_ri(void)
{
	struct bpf_net_context *bpf_net_ctx = bpf_net_ctx_get();
	if (!(bpf_net_ctx->ri.kern_flags & BPF_RI_F_RI_INIT)) {
In other words, bpf_net_ctx_get() is returning NULL.
The stack comes from the BPF helper bpf_redirect():
BPF_CALL_2(bpf_redirect, u32, ifindex, u64, flags)
{
	struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
Since I don't think there is XDP involved, I'm wondering if we need some
protection before calling bpf_redirect().
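For readers following along, the failure mode described above can be reproduced in a small userspace model. This is only a sketch: every model_* name is hypothetical, the structs are simplified stand-ins for the kernel ones, and the NULL check shown here is one possible form of "protection", not something proposed in this thread.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_RI_F_RI_INIT (1u << 0)	/* stand-in for BPF_RI_F_RI_INIT */

/* Simplified stand-in for struct bpf_redirect_info (hypothetical layout). */
struct model_redirect_info {
	unsigned int tgt_index;
	unsigned int kern_flags;
};

/* Simplified stand-in for struct bpf_net_context. */
struct model_net_context {
	struct model_redirect_info ri;
};

/* Stand-in for the per-task pointer; NULL means no context was assigned. */
static struct model_net_context *model_current_ctx;

static struct model_net_context *model_ctx_get(void)
{
	return model_current_ctx;
}

/*
 * Lazily initialise and return the redirect info. Unlike the accessor
 * quoted above, this model returns NULL when no context is set instead
 * of dereferencing the NULL pointer.
 */
static struct model_redirect_info *model_ctx_get_ri(void)
{
	struct model_net_context *ctx = model_ctx_get();

	if (!ctx)
		return NULL;

	if (!(ctx->ri.kern_flags & MODEL_RI_F_RI_INIT)) {
		memset(&ctx->ri, 0, sizeof(ctx->ri));
		ctx->ri.kern_flags |= MODEL_RI_F_RI_INIT;
	}
	return &ctx->ri;
}
```

Without the early return, the first access to ctx->ri.kern_flags is where a missing context would fault, which matches the dereference reported above.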
This originates in netkit_xmit(). If my memory serves me, Daniel told
me that netkit is not doing any redirect and therefore does not need
"this". This must have been during one of the first "designs"/versions.
Right, I've seen several crashes related to this, and in all of them it
is through netkit_xmit() -> netkit_run() -> bpf_prog_run()
If you are saying that this is possible, then something must be done.
Either assign a context or reject the bpf program.
If we want to assign a context, do you mean something like the
following?
Author: Breno Leitao <leitao@xxxxxxxxxx>
Date: Thu Sep 12 06:11:28 2024 -0700
netkit: Assign missing bpf_net_context.
During the introduction of struct bpf_net_context handling for
XDP-redirect, the netkit driver has been missed.
Set the bpf_net_context before invoking netkit_xmit() program within the
netkit driver.
Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.")
Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
index 79232f5cc088..f8af57b7a1e8 100644
--- a/drivers/net/netkit.c
+++ b/drivers/net/netkit.c
@@ -65,6 +65,7 @@ static struct netkit *netkit_priv(const struct net_device *dev)
 static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;
 	struct netkit *nk = netkit_priv(dev);
 	enum netkit_action ret = READ_ONCE(nk->policy);
 	netdev_tx_t ret_dev = NET_XMIT_SUCCESS;
@@ -72,6 +73,7 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct net_device *peer;
 	int len = skb->len;
+	bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
 	rcu_read_lock();
Hi Breno,
It looks like bpf_net_ctx should be set under the RCU read lock...
 	peer = rcu_dereference(nk->peer);
 	if (unlikely(!peer || !(peer->flags & IFF_UP) ||
@@ -110,6 +112,7 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
 		break;
 	}
 	rcu_read_unlock();
+	bpf_net_ctx_clear(bpf_net_ctx);
 	return ret_dev;
 }
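
The ordering raised in the inline comment (set the context inside the RCU read-side section, clear it before leaving) can be sketched in userspace as follows. All model_* names are hypothetical, the RCU calls are reduced to a nesting counter, and the set/clear semantics (set returns NULL when a context is already installed, clear is a no-op for NULL) are an assumption about how the kernel helpers behave, not a copy of them.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct bpf_net_context. */
struct model_net_context {
	unsigned int kern_flags;
};

static struct model_net_context *model_current_ctx;	/* per-task pointer stand-in */
static int model_rcu_depth;				/* read-side nesting stand-in */

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);	/* unbalanced unlock is a bug */
	model_rcu_depth--;
}

/* Install ctx unless an outer caller already installed one. */
static struct model_net_context *model_ctx_set(struct model_net_context *ctx)
{
	if (model_current_ctx != NULL)
		return NULL;	/* the outer owner keeps responsibility */
	ctx->kern_flags = 0;
	model_current_ctx = ctx;
	return ctx;
}

/* Only the caller that actually installed the context clears it. */
static void model_ctx_clear(struct model_net_context *ctx)
{
	if (ctx)
		model_current_ctx = NULL;
}

typedef int (*model_prog_fn)(void);

/* Transmit path model: lock, set, run the program, clear, unlock. */
static int model_xmit(model_prog_fn prog)
{
	struct model_net_context stack_ctx, *ctx;
	int ret;

	model_rcu_read_lock();
	ctx = model_ctx_set(&stack_ctx);
	ret = prog();
	model_ctx_clear(ctx);
	model_rcu_read_unlock();
	return ret;
}

/* Example program: returns 0 only with a context and inside the section. */
static int model_prog_needs_ctx(void)
{
	return (model_current_ctx != NULL && model_rcu_depth > 0) ? 0 : -1;
}
```

Calling model_xmit(model_prog_needs_ctx) returns 0 and leaves both the context pointer and the nesting counter reset afterwards. Whether the kernel helpers actually need to sit inside the read-side section, as opposed to around it as in the patch above, is the open question in this thread; the sketch only shows what the suggested ordering looks like.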