Re: [PATCHSET v8 0/12] Uncached buffered IO

On Fri, 20 Dec 2024 08:47:38 -0700 Jens Axboe <axboe@xxxxxxxxx> wrote:

> So here's a new approach to the same concept, but using the page cache
> as synchronization. Due to excessive bike shedding on the naming, this
> is now named RWF_DONTCACHE, and is less special in that it's just page
> cache IO, except it prunes the ranges once IO is completed.
> 
> Why do this, you may ask? The tldr is that device speeds are only
> getting faster, while reclaim is not. Doing normal buffered IO can be
> very unpredictable, and suck up a lot of resources on the reclaim side.
> This leads people to use O_DIRECT as a work-around, which has its own
> set of restrictions in terms of size, offset, and length of IO. It's
> also inherently synchronous, and now you need async IO as well. While
> the latter isn't necessarily a big problem as we have good options
> available there, it also should not be a requirement when all you want
> to do is read or write some data without caching.

Of course, we're doing something here which userspace could itself do:
drop the pagecache after reading it (with appropriate chunk sizing) and,
for writes, sync the written area then invalidate it.  There are possible
added benefits from using separate threads for this.
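
To make the userspace alternative concrete, here is a minimal sketch in C.
It assumes posix_fadvise(POSIX_FADV_DONTNEED) as the drop mechanism and
sync_file_range() (falling back to fdatasync()) as the sync step.  The
helper names read_drop_behind() and write_drop_behind() are hypothetical
and not part of the patchset, and the sketch does everything inline rather
than in separate threads.

```c
#define _GNU_SOURCE          /* for sync_file_range() */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the whole file in chunk-sized pieces and ask the kernel to drop
 * each piece from the page cache once it has been consumed.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_drop_behind(int fd, size_t chunk)
{
    assert(chunk > 0);

    char *buf = malloc(chunk);
    if (!buf)
        return -1;

    off_t off = 0;
    ssize_t n;
    while ((n = pread(fd, buf, chunk, off)) > 0) {
        /* ... the application would consume buf[0..n) here ... */

        /* Advisory only: the kernel may keep pages that are dirty,
         * mapped, or only partially covered by the range. */
        posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
        off += n;
    }

    free(buf);
    return n < 0 ? -1 : (ssize_t)off;
}

/*
 * Write one buffer, push that range to storage, then invalidate it.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t write_drop_behind(int fd, const void *buf, size_t len, off_t off)
{
    ssize_t n = pwrite(fd, buf, len, off);
    if (n <= 0)
        return n;

    /* Sync only the written area; fall back to a whole-file data sync
     * if the ranged call is not available on this filesystem. */
    if (sync_file_range(fd, off, n,
                        SYNC_FILE_RANGE_WAIT_BEFORE |
                        SYNC_FILE_RANGE_WRITE |
                        SYNC_FILE_RANGE_WAIT_AFTER) < 0) {
        if (fdatasync(fd) < 0)
            return -1;
    }

    /* The pages are clean now, so DONTNEED can actually drop them. */
    posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
    return n;
}
```

Note that the sync step in this sketch blocks the caller, which is where
the separate-threads idea would come in.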

I suggest that diligence requires that we at least justify an in-kernel
approach at this time, please.

And there's a possible middle-ground implementation where the kernel
itself kicks off threads to do the drop-behind just before the read or
write syscall returns, which will probably be simpler.  Can we please
describe why this also isn't acceptable?


Also, it seems wrong for a read(RWF_DONTCACHE) to drop cache if it was
already present, because it was presumably present for a reason.  Does
this implementation already take care of this, so that an application
which does read(/etc/passwd, RWF_DONTCACHE) is less annoying?
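
For concreteness, the kind of call in question might look like the sketch
below.  It assumes the flag reaches userspace through preadv2(); the
fallback value 0x00000080 for RWF_DONTCACHE is an assumption for headers
that do not define it, and read_dontcache() is a hypothetical helper name.
Whether pages that were already cached survive such a read is exactly the
open question, and the sketch does not answer it.

```c
#define _GNU_SOURCE          /* for preadv2() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080   /* assumed value, see lead-in */
#endif

/*
 * Read len bytes at offset off, asking the kernel not to leave the data
 * in the page cache.  Falls back to an ordinary cached pread() when the
 * kernel or filesystem rejects the flag.
 */
ssize_t read_dontcache(int fd, void *buf, size_t len, off_t off)
{
    assert(buf != NULL);

    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t n = preadv2(fd, &iov, 1, off, RWF_DONTCACHE);

    /* Unsupported flag or syscall: degrade to normal buffered IO. */
    if (n < 0 && (errno == EOPNOTSUPP || errno == EINVAL || errno == ENOSYS))
        n = pread(fd, buf, len, off);

    return n;
}
```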


Also, consuming a new page flag isn't a minor thing.  It would be nice
to see some justification around this, and some description of how many
we have left.



