On 1/24/23 4:02 PM, Kui-Feng Lee wrote:
The kernel crash was caused by a BPF program attached to the
"lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to
`bpf_setsockopt()` in order to set the TCP_NODELAY flag. This flag
Note that this race is not limited to TCP_NODELAY.
causes the kernel to flush the outgoing queue of a socket, and this
hook can be triggered during a softirq. The issue was that in certain
circumstances, when `tcp_write_xmit()` was called to flush the queue,
it would also allow BH (bottom-half) to run. This could lead to our
program attempting to flush the same socket recursively, which caused
a `skbuf` to be unlinked twice.
Thanks for the fix.
The commit message could use more details about this particular
security_sock_rcv_skb() hook. Something like,
security_sock_rcv_skb() is called from tcp_filter(). In tcp_v4_rcv(),
tcp_filter() is called before the sock_owned_by_user() check. If a bpf prog is
run in security_sock_rcv_skb() from the softirq, it may not own the sock lock
and break the bpf_setsockopt() assumption.
The patch fixes this issue by ensuring that a BPF program attached to
the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call
`bpf_setsockopt()`.
Please add Fixes tag. Also, this should target for the bpf tree.
Fixes: 9113d7e48e91 ("bpf: expose bpf_{g,s}etsockopt to lsm cgroup")
Signed-off-by: Kui-Feng Lee <kuifeng@xxxxxxxx>
---
kernel/bpf/bpf_lsm.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index a4a41ee3e80b..e14c822f8911 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -51,7 +51,6 @@ BTF_SET_END(bpf_lsm_current_hooks)
*/
BTF_SET_START(bpf_lsm_locked_sockopt_hooks)
#ifdef CONFIG_SECURITY_NETWORK
-BTF_ID(func, bpf_lsm_socket_sock_rcv_skb)
BTF_ID(func, bpf_lsm_sock_graft)
BTF_ID(func, bpf_lsm_inet_csk_clone)
BTF_ID(func, bpf_lsm_inet_conn_established)