Re: [PATCH] pipe_read: don't wake up the writer if the pipe is still full

Oleg Nesterov <oleg@xxxxxxxxxx> · Thu, 27 Feb 2025 17:28:32 +0100

On 02/27, Mateusz Guzik wrote:
>
> On Thu, Feb 27, 2025 at 1:51 PM Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > Sapkal, I don't think this can explain the hang, receiver()->read()
> > should wake this writer later anyway. But could you please retest
> > with the patch below?
> >
> > Thanks,
> >
> > Oleg.
> > ---
> >
> > diff --git a/fs/pipe.c b/fs/pipe.c
> > index b0641f75b1ba..222881559c30 100644
> > --- a/fs/pipe.c
> > +++ b/fs/pipe.c
> > @@ -455,6 +455,7 @@ anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
> >          * page-aligns the rest of the writes for large writes
> >          * spanning multiple pages.
> >          */
> > +again:
> >         head = pipe->head;
> >         was_empty = pipe_empty(head, pipe->tail);
> >         chars = total_len & (PAGE_SIZE-1);
> > @@ -559,8 +560,8 @@ anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
> >                 kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
> >                 wait_event_interruptible_exclusive(pipe->wr_wait, pipe_writable(pipe));
> >                 mutex_lock(&pipe->mutex);
> > -               was_empty = pipe_empty(pipe->head, pipe->tail);
> >                 wake_next_writer = true;
> > +               goto again;
> >         }
> >  out:
> >         if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
> >
>
> I think this is buggy.
>
> You get wakeups also when the last reader goes away. The for loop you
> are jumping out of makes sure to check for the condition, same for the
> first mutex acquire. With this goto you can get a successful write
> instead of getting SIGPIPE. iow this should goto few lines higher.

Yes, yes, and then we need to remove another pipe->readers check
in the main loop.

> I am not sure about the return value. The for loop bumps ret with each
> write, but the section you are jumping to overwrites it.

Ah, yes, thanks, I missed that.

OK, I'll make another one tomorrow, I need to run away.

Until then, it would be nice to test this patch with hackbench anyway.

> However, I do think something may be going on with the "split" ops,
> which is why I suggested going from 100 bytes where the bug was
> encountered to 128 for testing purposes. If that cleared it, that
> would be nice for sure. :>

Yes, but note that the same scenario can happen with 128 bytes as well.
It doesn't really matter how many bytes < PAGE_SIZE the sleeping writer
needs to write, another writer can steal the buffer released by the
last reader in any case.

Thanks!

Oleg.