Re: [PATCH v3] fs/splice: don't block splice_direct_to_actor() after data was read

Max Kellermann <max.kellermann@xxxxxxxxx> · Tue, 4 Jun 2024 13:14:05 +0200

On Tue, Jun 4, 2024 at 12:41 PM Jan Kara <jack@xxxxxxx> wrote:
> Well, I can see your pain but after all the kernel does exactly what
> userspace has asked for?

That is a valid point of view; indeed the kernel's behavior is correct
according to the specification, but that was not my point.

This is about an exotic problem that occurs only in very rare
circumstances (depending on hard disk speed, network speed and
timing), but when it occurs, it blocks the calling process for a very
long time, which can then cause problems more serious than user
unhappiness (e.g. expiring timeouts). (As I said, nginx had to work
around this problem.)

I'd like to optimize this special case, and adjust the kernel to
always behave like the common case.

> After all there's no substantial difference between userspace issuing
> a 2GB read(2) and 2GB sendfile(2).

I understand your fear of breaking userspace, but this doesn't apply
here, because yes, there is indeed a substantial difference: in the
normal case, sendfile() stops when the destination socket buffer is
full. That is the normal mode of operation, which all applications
must be prepared for, because short sendfile() calls happen all the
time; they are the common case.

My patch is ONLY about fixing that exotic special case where the
socket buffer is drained over and over while sendfile() still runs.

> there are too many userspace applications that depend on this behavior...

True for read() - but which application depends on this very special
behavior that only occurs in very rare exceptional cases? I think we
have a slight misunderstanding about the circumstances of the problem.