On 11/11/24 10:25 AM, Matthew Wilcox wrote:
> On Sun, Nov 10, 2024 at 08:27:52AM -0700, Jens Axboe wrote:
>> 5 years ago I posted patches adding support for RWF_UNCACHED, as a way
>> to do buffered IO that isn't page cache persistent. The approach back
>> then was to have private pages for IO, and then get rid of them once IO
>> was done. But that then runs into all the issues that O_DIRECT has, in
>> terms of synchronizing with the page cache.
>
> Today's a holiday, and I suspect you're going to do a v3 before I have
> a chance to do a proper review of this version of the series.

Probably, since I've done some fixes since v2 :-). So you can wait for
v3, I'll post it later today anyway.

> I think "uncached" isn't quite the right word. Perhaps 'RWF_STREAMING'
> so that userspace is indicating that this is a streaming I/O and the
> kernel gets to choose what to do with that information.

Yeah not sure, it's the one I used back in the day, and I still haven't
found a more descriptive word for it. That doesn't mean one doesn't
exist, certainly taking suggestions. I don't think STREAMING is the
right one however, you could most certainly be doing random uncached IO.

> Also, do we want to fail I/Os to filesystems which don't support
> it? I suppose really sophisticated userspace might fall back to
> madvise(DONTNEED), but isn't most userspace going to just clear the flag
> and retry the I/O?

Also something that's a bit undecided, you can make arguments for both
ways. For just ignoring the flag if it's not supported, the argument
would be that the application just wants to do IO, uncached if
available. For the other argument, maybe you have an application that
wants to fall back to O_DIRECT if uncached isn't available. That
application certainly wants to know if it works or not.

Which is why I defaulted to returning -EOPNOTSUPP if it's not available.
An application may probe this upfront if it so desires, and just not set
the flag for IO. That'd keep it out of the hot path.

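To make the upfront probe concrete, here is a rough userspace sketch of
what I have in mind. It is only an illustration under assumptions: the
helper names (uncached_supported, pread_hinted, read_file_head) are made
up for this mail, and the RWF_UNCACHED value below is a placeholder, the
real one has to come from the uapi headers of a kernel carrying the
series. The only behavior it relies on is the one described above, that
the IO fails with -EOPNOTSUPP when the flag isn't available.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for preadv2() in glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * ASSUMPTION: placeholder flag value for illustration only. Use the
 * definition from the patched kernel's uapi headers instead.
 */
#ifndef RWF_UNCACHED
#define RWF_UNCACHED 0x00000080
#endif

/*
 * Hypothetical helper: probe once, e.g. right after open(). A 1-byte read
 * at offset 0 is enough to find out whether the flag is accepted for this
 * file. Any failure (including EOPNOTSUPP) is treated as "don't use it".
 */
static bool uncached_supported(int fd)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };

	return preadv2(fd, &iov, 1, 0, RWF_UNCACHED) >= 0;
}

/*
 * Hypothetical helper for the IO path: no retry logic here, the flag is
 * only set if the earlier probe said it works.
 */
static ssize_t pread_hinted(int fd, void *buf, size_t len, off_t off,
			    bool uncached)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	assert(len > 0);
	return preadv2(fd, &iov, 1, off, uncached ? RWF_UNCACHED : 0);
}

/* Usage example: open, probe once, then read the start of the file. */
static ssize_t read_file_head(const char *path, void *buf, size_t len)
{
	ssize_t ret;
	bool uncached;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	uncached = uncached_supported(fd);
	ret = pread_hinted(fd, buf, len, 0, uncached);
	close(fd);
	return ret;
}
```

This version simply drops back to regular buffered IO when the probe
fails. An application that would rather use O_DIRECT in that case could
reopen the file with O_DIRECT at the same point, which is exactly why it
needs to be told whether the flag works.
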
Seems to me that returning whether it's supported or not is the path of
least surprises for applications, which is why I went that way.

> Um. Now I've looked, we also have posix_fadvise(POSIX_FADV_NOREUSE),
> which is currently a noop. But would we be better off honouring
> POSIX_FADV_NOREUSE than introducing RWF_UNCACHED? I'll think about this
> some more while I'm offline.

That would certainly work too, for synchronous IO. But per-file hints
are a bad idea for async IO, for obvious reasons. We really want per-IO
hints for that, we have a long history of messing that up.

That doesn't mean that FMODE_NOREUSE couldn't just set RWF_UNCACHED, if
it's set. That'd be trivial. Then the next question is if setting
POSIX_FADV_NOREUSE should fail if file->f_op->fop_flags & FOP_UNCACHED
isn't true. Probably not, since it'd potentially break applications. So
probably best to just set f_iocb_flags IFF FOP_UNCACHED is true for that
file.

And the bigger question is why on earth do we have this thing in the
kernel that doesn't do anything... But yeah, now we could make it do
something.

-- 
Jens Axboe