This is a note to let you know that I've just added the patch titled

    tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit

to the 4.19-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     tcp-return-epollout-from-tcp_poll-only-when-notsent_.patch
and it can be found in the queue-4.19 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


commit 397c4b513c6f7007aa916444c7e6a92f70c1b008
Author: Soheil Hassas Yeganeh <soheil@xxxxxxxxxx>
Date:   Mon Sep 14 17:52:09 2020 -0400

    tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit

    [ Upstream commit 8ba3c9d1c6d75d1e6af2087278b30e17f68e1fff ]

    If there was any event available on the TCP socket, tcp_poll()
    will be called to retrieve all the events.  In tcp_poll(), we call
    sk_stream_is_writeable() which returns true as long as we are at least
    one byte below notsent_lowat.  This will result in quite a few
    spurious EPOLLOUT and frequent tiny sendmsg() calls as a result.

    Similar to sk_stream_write_space(), use __sk_stream_is_writeable
    with a wake value of 1, so that we set EPOLLOUT only if half the
    space is available for write.

    Signed-off-by: Soheil Hassas Yeganeh <soheil@xxxxxxxxxx>
    Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
    Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
    Stable-dep-of: e14cadfd80d7 ("tcp: add annotations around sk->sk_shutdown accesses")
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 68f89fe7f9233..2fcf6e5a371dd 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -576,7 +576,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 			mask |= EPOLLIN | EPOLLRDNORM;
 
 		if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
-			if (sk_stream_is_writeable(sk)) {
+			if (__sk_stream_is_writeable(sk, 1)) {
 				mask |= EPOLLOUT | EPOLLWRNORM;
 			} else {  /* send SIGIO later */
 				sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
@@ -588,7 +588,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 				 * pairs with the input side.
 				 */
 				smp_mb__after_atomic();
-				if (sk_stream_is_writeable(sk))
+				if (__sk_stream_is_writeable(sk, 1))
 					mask |= EPOLLOUT | EPOLLWRNORM;
 			}
 		} else
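
The writeability rule described in the commit message can be modeled in user
space as below. This is only a sketch and not the kernel source: the helper
name model_is_writeable() is hypothetical, and it assumes the check compares
the unsent byte count, shifted left by the wake value, against notsent_lowat.
It also ignores the separate send-buffer space check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the notsent_lowat test (assumption, not kernel code).
 * wake == 0: writeable as soon as notsent_bytes is below the limit.
 * wake == 1: writeable only once notsent_bytes is below half the limit.
 * The cast to 64 bits just keeps the shift from overflowing in this model.
 */
static bool model_is_writeable(uint32_t notsent_bytes,
			       uint32_t notsent_lowat, int wake)
{
	return ((uint64_t)notsent_bytes << wake) < notsent_lowat;
}
```

With a limit of 131072 bytes and 100000 bytes unsent, the model reports
writeable for wake 0 but not for wake 1; with 60000 bytes unsent it reports
writeable for both, which is the point at which the patched tcp_poll() would
be expected to set EPOLLOUT.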