Re: Pending splice(file -> FIFO) excludes all other FIFO operations forever (was: ... always blocks read(FIFO), regardless of O_NONBLOCK on read side?)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/7/23 1:10?PM, Linus Torvalds wrote:
> On Fri, 7 Jul 2023 at 10:21, Christian Brauner <brauner@xxxxxxxxxx> wrote:
>>
>> Forgot to say, fwiw, I've been running this through the LTP splice,
>> pipe, and ipc tests without issues. A hanging reader can be signaled
>> away cleanly with this.
> 
> So that patch still has a couple of "wait for this" cases remaining.
> 
> In particular, when we do a read, and we do have pipe buffers, both
> the read() system call and a number of internal splice functions will
> go "Ahh, I have data", and then do pipe_buf_confirm() and read it.
> 
> Which then results in pipe_buf_confirm() blocking. It now blocks
> interruptibly, which is much nicer, but several of these users *could*
> just do a non-blocking confirmation instead, and wait for pipe
> readability.
> 
> HOWEVER, that's slightly less trivial than you'd expect, because the
> "wait for readability" needs to be done without the pipe lock held -
> so you can't actually check the pipe buffer state at that point (since
> you need the pipe lock to look up the buffer).
> 
> That's true even of "trivial" cases like actual user-space "read()
> with O_NONBLOCK and poll()" situations.
> 
> Now, the solution to all this is *fairly* straightforward:
> 
>  (a) don't use "!pipe_empty()" for a readability check.
> 
>      We already have "pipe_readable()", but it's hidden in fs/pipe.c,
> so all the splice() code ended up writing the "does this pipe have
> data" using "!pipe_empty()" instead.
> 
>  (b) make "pipe_buf_confirm()" take a "non-blocking" boolean argument,
> and if it is non-blocking but hits one of those blocked pages, set
> "pipe->not_ready", and return -EAGAIN.
> 
>      This is ok, because "pipe_buf_confirm()" is always under the pipe
> lock, and we'll just clear "pipe->not_ready" under the pipe lock after
> finalizing all those pages (and before waking up readers)
> 
>  (c) make "pipe_wait_readable()" and "poll()" know about this all, so
> that we wait properly for a pipe that was not ready to become ready
> 
> This all makes *most* users deal properly with these blocking events.
> In particular, things like splice_to_socket() can now do the whole
> proper "wait without holding the pipe lock" sequence, even when the
> pipe is not empty, just in this blocked state.
> 
> This *may* also make all the cases Jens had with io_uring and splicing
> JustWork(tm).

Exactly! I was reading this thread with excitement just now, would be
nice to get rid of that kludge.

> NOTE! NOTE! NOTE! Once more, this "feels right to me", and I'd argue
> that the basic approach is fairly straightfoward. The patch is also
> not horrendous. It all makes a fair amount of sense. BUT! I haven't
> tested this, and like the previous patch, I really would want people
> to think about this a lot.
> 
> Comments? Jens?

I'll take a closer look at this, but won't be until Monday most likely.
But the approach seems sane, and going in a more idiomatic direction
than before. So seems promising.

-- 
Jens Axboe




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux