Is it too heavy-handed to rely on TCP socket memory (defined in /proc/sys/net/ipv4/tcp_mem)
to stop the sender from sending too much data when we run sk_msg redirect
through a sockmap with the TCP protocol?

Looking forward to your reply. Thanks so much.

On 2024/6/18 16:08, zhengguoyong wrote:
> Thanks for the reply.
>
> I mean sk_msg with the TCP protocol. In this case, the sender uses sk_stream_memory_free()
> to check whether memory is free. In __sk_stream_memory_free(), if
> sk->sk_wmem_queued is bigger than sk->sk_sndbuf, or the sk's
> notsent_bytes (tp->write_seq - tp->snd_nxt) is too big, then
> __sk_stream_memory_free() returns false and the sender does sk_stream_wait_memory().
>
> But in sk_msg mode, tcp_bpf_sendmsg() does not create an skb and does not use seq numbers to
> record sending info, so sk->sk_wmem_queued is not changed in the tcp_bpf_sendmsg() path,
> and __sk_stream_memory_free() will always return true.
>
> bpf_tcp_ingress() copies the sender's msg and charges it, and
> tcp_bpf_recvmsg() uncharges the msg after sk_msg_recvmsg()
> receives it from the psock ingress_msg queue. If the receiver stops reading
> due to an application bug while the sender keeps sending, then the receiver's
> psock ingress_msg queue will keep growing and cannot be uncharged
> until TCP socket memory runs out in the following path:
>
>   tcp_bpf_sendmsg
>     tcp_bpf_send_verdict
>       tcp_bpf_sendmsg_redir
>         bpf_tcp_ingress
>           sk_wmem_schedule
>
> So if an sk_msg type sockmap receiver is blocked, it may consume all the
> TCP socket memory and affect other TCP streams.
> Can we limit each sockmap TCP stream, the way sk->sk_sndbuf does?
>
> Thanks.
>
> On 2024/6/18 1:07, [External account] John Fastabend wrote:
>> 郑国勇 wrote:
>>> Hi,
>>>
>>> In the sockmap case, when the sender sends a msg, sk_psock_queue_msg() puts the msg into the receiver's psock ingress_msg queue and wakes up the receiver to receive it.
>>>
>> What's the protocol?
>> The TCP case tcp_bpf_sendmsg() is checking
>> sk_stream_memory_free() and will do sk_stream_wait_memory() if under
>> memory pressure. This should handle the sending case with lots of data
>> queued up on the sk.
>>
>> On the redirect ingress case we do this,
>>
>>   sk_psock_handle_skb()
>>     sk_psock_skb_ingress()
>>       sk_psock_create_ingress_msg()
>>
>> There sk_psock_create_ingress_msg() should check the rcvbuf of the
>> receiving socket and shouldn't create a msg if it's under memory pressure.
>> If it's an ingress_self case we do a skb_set_owner_r which should (?) push
>> back on the memory side through sk_mem_charge().
>>
>> Seems like I'm missing some case then if we are hitting this. What protocol
>> and what is the BPF program? Is it a sender redirect? I guess more details
>> might make it obvious to me.
>>
>>
>>> The sender can always send msgs but is not aware of the receiver's psock ingress_msg queue size. In the worst case, when the receiver stops receiving due to an application bug,
>>>
>>> the sender can keep sending msgs until system memory runs out. If this happens, it will affect the whole system.
>>>
>>> My question is: is there a better solution for this case? Just like TCP uses sk_sndbuf to stop the sender from sending again if the receiver is blocked.
>> The sender shouldn't be able to have more outstanding data than the
>> socket memory allows. After the redirect the skb/msg should be
>> charged to the receiving socket though. Agree sk_sndbuf should
>> limit sender. Maybe the test is not TCP protocol and we missed
>> adding the limits to UDP/AF_UNIX/etc?
>>
>>> Thanks very much.
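
To make the accounting gap described in this thread concrete, here is a rough user-space sketch in C. It is not kernel code: struct model_sock and all model_* functions are hypothetical names made up for illustration, and only the sk_wmem_queued vs. sk_sndbuf half of __sk_stream_memory_free() is modelled. It assumes the behaviour as described above (the sk_msg redirect path does not touch the sender's sk_wmem_queued), which has not been checked against a specific kernel version.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few struct sock fields discussed in the thread. */
struct model_sock {
    long wmem_queued;   /* models sk->sk_wmem_queued */
    long sndbuf;        /* models sk->sk_sndbuf */
    long ingress_bytes; /* models bytes parked on the psock ingress_msg queue */
};

/* Models only the sk_wmem_queued check of __sk_stream_memory_free(). */
static bool model_stream_memory_free(const struct model_sock *sk)
{
    return sk->wmem_queued < sk->sndbuf;
}

/* Plain TCP send: queued bytes count against the sender's sndbuf. */
static bool model_tcp_send(struct model_sock *sk, long len)
{
    if (!model_stream_memory_free(sk))
        return false; /* the real path would call sk_stream_wait_memory() */
    sk->wmem_queued += len;
    return true;
}

/*
 * sk_msg redirect as described in the thread (assumption): the sender's
 * wmem_queued is never updated, so the check above can never fail, and
 * the bytes pile up on the receiver's ingress queue instead.
 */
static bool model_sk_msg_redirect(struct model_sock *sender,
                                  struct model_sock *receiver, long len)
{
    if (!model_stream_memory_free(sender))
        return false;
    receiver->ingress_bytes += len;
    return true;
}

/* Count sends of len bytes that succeed before the first refusal, capped at max_sends. */
static int model_count_sends(bool redirect, long sndbuf, long len, int max_sends)
{
    struct model_sock sender = { 0, sndbuf, 0 };
    struct model_sock receiver = { 0, sndbuf, 0 };
    int sent = 0;

    assert(len > 0 && max_sends >= 0);
    while (sent < max_sends) {
        bool ok = redirect ? model_sk_msg_redirect(&sender, &receiver, len)
                           : model_tcp_send(&sender, len);
        if (!ok)
            break;
        sent++;
    }
    return sent;
}
```

With a 4096-byte sndbuf and 1024-byte sends, the plain path in this model is refused after 4 sends, while the redirect path is never refused by the sndbuf check and only stops at the max_sends cap. In the real kernel the backstop would be sk_wmem_schedule() against tcp_mem, as in the call path quoted above.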