Hello John, For background, I have a sk_skb program where for certain flows, a subset of request handling logic (on Rx) that is typically done in user space is handed over to a BPF program for doing it in-kernel. Since the protocol is over TCP, I'm using sk_skb programs to handle these requests and generate responses from within the kernel. The user space application will decide which flow should be added to the sockmap on socket accept and then let it be handled in the kernel for some of the request types. However, when generating replies and trimming away extra bytes in the packet by using bpf_skb_adjust_room, I see packet drops because the strp_msg->full_len is not updated unless tls_sw_has_ctx_rx is true for the socket. Removing that condition makes everything work, but I am not sure that is the right fix. My understanding so far is that in sk_psock_backlog, sk_psock_handle_skb returns an error since the stm->full_len is not correct (and passed into it as len), and SK_PSOCK_TX_ENABLED gets cleared due to this error. Then, on the redirect side (sk_psock_skb_redirect), it does sock_drop because the test for the SK_PSOCK_TX_ENABLED flag fails, which does kfree_skb. I have attached a patch with a selftest (skb_adjust_room) that you can run to verify the problem. It should basically hang on the second read as the packet would be dropped by the kernel. The first read goes through fine. Applying the diff below allows the selftest to pass. I am not familiar with the code to know whether the fix below is correct, but it seems to resolve the problem for me (and I carried it for a while and ran my stuff on it until I got around to reporting this now). In case gmail screws up the formatting, I'm just removing the tls_sw_has_ctx_rx(skb->sk) conditional on the strp_msg->full_len update in sk_skb_adjust_room. diff --git a/net/core/filter.c b/net/core/filter.c index 524adf1fa6d0..e3009d0f3352 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3622,11 +3622,7 @@ BPF_CALL_4(sk_skb_adjust_room, struct sk_buff *, skb, s32, len_diff, return -ENOMEM; __skb_pull(skb, len_diff_abs); } - if (tls_sw_has_ctx_rx(skb->sk)) { - struct strp_msg *rxm = strp_msg(skb); - - rxm->full_len += len_diff; - } + strp_msg(skb)->full_len += len_diff; return ret; } -- Thanks!
Attachment:
test.patch
Description: Binary data