On Tue, Jun 4, 2024 at 12:41 PM Jan Kara <jack@xxxxxxx> wrote:
> Well, I can see your pain but after all the kernel does exactly what
> userspace has asked for?

That is a valid point of view; indeed the kernel's behavior is correct
according to the specification, but that was not my point.

This is about an exotic problem that occurs only in very rare
circumstances (depending on hard disk speed, network speed and timing),
but when it occurs, it blocks the calling process for a very long time,
which can then cause problems more serious than user unhappiness (e.g.
expiring timeouts). (As I said, nginx had to work around this problem.)

I'd like to optimize this special case, and adjust the kernel to always
behave like the common case.

> After all there's no substantial difference between userspace issuing
> a 2GB read(2) and 2GB sendfile(2).

I understand your fear of breaking userspace, but this doesn't apply
here, because yes, there is indeed a substantial difference: in the
normal case, sendfile() stops when the destination socket buffer is
full. That is the normal mode of operation, which all applications must
be prepared for, because short sendfile() calls happen all the time,
that's the common case.

My patch is ONLY about fixing that exotic special case where the socket
buffer is drained over and over while sendfile() still runs.

> there are too many userspace applications that depend on this
> behavior...

True for read() - but which application depends on this very special
behavior that only occurs in very rare exceptional cases?

I think we have a slight misunderstanding about the circumstances of
the problem.