On 9/8/23 2:00 PM, Stanislav Fomichev wrote:
> Commit 151e887d8ff9 ("veth: Fixing transmit return status for dropped
> packets") exposed the fact that bpf_clone_redirect is capable of
> returning raw NET_XMIT_XXX return codes.
>
> This is in conflict with its UAPI doc, which says the following:
> "0 on success, or a negative error in case of failure."
>
> Let's wrap dev_queue_xmit's return value (in __bpf_tx_skb) into
> net_xmit_errno to make sure we correctly propagate NET_XMIT_DROP
> as -ENOBUFS instead of 1.
>
> Note, this is technically breaking existing UAPI where we used to
> return 1 and now will do -ENOBUFS. The alternative is to
> document that bpf_clone_redirect can return 1 for DROP and 2 for CN.
>
> Reported-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> ---
>  net/core/filter.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index a094694899c9..9e297931b02f 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2129,6 +2129,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
>  	ret = dev_queue_xmit(skb);
>  	dev_xmit_recursion_dec();
> +	if (ret > 0)
> +		ret = net_xmit_errno(ret);

I think it is better to have bpf_clone_redirect return -ENOBUFS instead of
leaking NET_XMIT_XXX to the uapi. The bpf_clone_redirect doc in uapi/bpf.h also
says:

 * Return
 * 	0 on success, or a negative error in case of failure.

If -ENOBUFS is returned in __bpf_tx_skb, should the same be done for
__bpf_rx_skb? And should net_xmit_errno() only be done for bpf_clone_redirect()?
__bpf_{tx,rx}_skb is also used by skb_do_redirect(), which also calls
__bpf_redirect_neigh(). That one returns NET_XMIT_xxx, but no caller seems to
care about the NET_XMIT_xxx value now.

Daniel should know more here. I would wait for Daniel to comment.
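
To make the mapping concrete, here is a minimal userspace sketch of what the
added check does to each return code. It assumes net_xmit_errno() turns every
code other than NET_XMIT_CN into -ENOBUFS and NET_XMIT_CN into 0 (my reading of
include/linux/netdevice.h). The sketch_* names and the enum are stand-ins for
illustration, not kernel symbols.

```c
#include <errno.h>

/* Stand-ins for the kernel's NET_XMIT_* codes (DROP = 1 and CN = 2, as in
 * the commit message); redefined here so the sketch builds in userspace. */
enum {
	SKETCH_NET_XMIT_SUCCESS = 0,
	SKETCH_NET_XMIT_DROP    = 1,
	SKETCH_NET_XMIT_CN      = 2,
};

/* Assumed behaviour of net_xmit_errno(): congestion notification is not
 * treated as an error, every other code becomes -ENOBUFS. */
static int sketch_net_xmit_errno(int e)
{
	return e != SKETCH_NET_XMIT_CN ? -ENOBUFS : 0;
}

/* Mirrors the check added to __bpf_tx_skb(): only positive (NET_XMIT_*)
 * values are rewritten; 0 and negative errnos pass through unchanged. */
static int sketch_tx_status(int ret)
{
	if (ret > 0)
		ret = sketch_net_xmit_errno(ret);
	return ret;
}
```

Under that assumption, DROP (1) becomes -ENOBUFS, CN (2) becomes 0, and a
negative errno from dev_queue_xmit() is left alone, so the __bpf_tx_skb path
would no longer hand a positive value back to the program.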
~~~~
For the selftest, maybe another option is to use a 28-byte data_in for the lwt
program redirecting to veth? 14 bytes are used by bpf_prog_test_run_skb, which
leaves 14 bytes for veth_xmit. It seems the original intention of the "veth
ETH_HLEN+1 packet ingress" test was for it to succeed as well.

> +
>  	return ret;
>  }