On Fri, May 03, 2024 at 09:48:05AM -0700, John Fastabend wrote: > [ Upstream commit ebf2e8860eea66e2c4764316b80c6a5ee5f336ee] > [ Upstream commit f8dd95b29d7ef08c19ec9720564acf72243ddcf6] Why are you mushing 2 patches together? Why can't we just take the two as-is instead? That makes tracking everything much simpler and possible. > In the first patch, > > ebf2e8860eea ("tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg") > > This block of code is added to tcp_bpf_push(). The > tcp_bpf_push is the code used by BPF to submit messages into the TCP > stack. > > if (flags & MSG_SENDPAGE_NOTLAST) > msghdr.msg_flags | MSG_MORE; > > In the second patch, > > f8dd95b29d7e ("tcp_bpf, smc, tls, espintcp, siw: Reduce MSG_SENDPAGE_NOTLAST usage") > > this logic was further changed to, > > if (flags & MSG_SENDPAGE_NOTLAST) > msghdr.msg_flags |= MSG_MORE > > This was done as part of an improvement to use the sendmsg() callbacks > and remove the sendpage usage inside the various sub systems. > > However, these two patches together fixed a bug. The issue is without > MSG_MORE set we will break a msg up into many smaller sends. In some > case a lot because the operation loops over the scatter gather list. > Without the MSG_MORE set (the current 6.1 case) we see stalls in data > send/recv and sometimes applications failing to receive data. This > generally is the result of an application that gives up after calling > recv() or similar too many times. We introduce this because of how > we incorrectly change the TCP send pattern. > > Now that we have both 6.5 and 6.1 stable kernels deployed we've > observed a series of issues related to this in real deployments. In 6.5 > kernels all the HTTP and other compliance tests pass and we are not > observing any other issues. On 6.1 various compliance tests fail > (nginx for example), but more importantly in these clusters without > the flag set we observe stalled applications and increased retries in > other applications. Openssl users where we have annotations to monitor > retries and failures observed a significant increase in retries for > example. > > For the backport we isolated the fix to the two lines in the above > patches that fixed the code. With this patch we deployed the workloads > again and error rates and stalls went away and 6.1 stable kernels > perform similar to 6.5 stable kernels. Similarly the compliance tests > also passed. Can we just take the two original patches instead? thanks, greg k-h