io_uring force_nonblock vs POSIX_FADV_WILLNEED

Hi

Currently io_uring executes fadvise in submission context except for
DONTNEED:

static int io_fadvise(struct io_kiocb *req, struct io_kiocb **nxt,
		      bool force_nonblock)
{
...
	/* DONTNEED may block, others _should_ not */
	if (fa->advice == POSIX_FADV_DONTNEED && force_nonblock)
		return -EAGAIN;

which makes sense for POSIX_FADV_{NORMAL, RANDOM, SEQUENTIAL}, but doesn't
seem to be true for POSIX_FADV_WILLNEED?

As far as I can tell POSIX_FADV_WILLNEED synchronously starts readahead,
including page allocation etc, which of course might block for quite a
while. The fs also quite possibly needs to read metadata.


Seems like either WILLNEED would have to always be deferred, or
force_page_cache_readahead, __do_page_cache_readahead etc. would need to
be wired up to know not to block. That includes returning EAGAIN, despite
force_page_cache_readahead and generic_readahead() intentionally ignoring
return values / errors.
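
To illustrate the first option, here is a rough standalone model of what
the nonblocking check might look like if WILLNEED were always deferred.
This is not a kernel patch: fadvise_nonblock_check() is a made-up name,
and it only mirrors the -EAGAIN convention from the snippet above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/*
 * Standalone model (not kernel code) of the "always defer WILLNEED"
 * option. Returns -EAGAIN when the request should be punted to a
 * context that is allowed to block, 0 when it can run inline.
 * The function name is hypothetical.
 */
int fadvise_nonblock_check(int advice, bool force_nonblock)
{
	if (!force_nonblock)
		return 0;

	switch (advice) {
	case POSIX_FADV_DONTNEED:	/* may block, as in the existing check */
	case POSIX_FADV_WILLNEED:	/* starts readahead, which may block too */
		return -EAGAIN;
	default:			/* remaining advice values should not block */
		return 0;
	}
}
```

With this, WILLNEED in submission context gets the same treatment as
DONTNEED, at the cost of always paying for the deferral even when the
range is already cached.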

I guess it's also possible to just add a separate precheck that checks
whether there's any IO needing to be done for the range. That could
potentially also be used to make DONTNEED nonblocking in case everything
is clean already, which seems like it could be nice. But that seems
weird modularity-wise.


Context: postgres has long used POSIX_FADV_WILLNEED to do a poor man's
version of async buffered reads, when it knows it needs to do a fair bit
of random reads that are already known (e.g. for bitmap heap scans).
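
For readers not familiar with the pattern, a minimal userspace sketch of
it follows. This is not postgres code: prefetch_blocks() and
read_blocks() are hypothetical helpers, and the fixed block size is an
assumption. The idea is to issue WILLNEED for every block that is known
to be needed, and only read the blocks afterwards, hoping readahead has
already pulled them into the page cache.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: hint the kernel about every block we will read.
 * Returns 0 on success, or the error number from posix_fadvise().
 */
int prefetch_blocks(int fd, const off_t *offsets, size_t nblocks,
		    size_t block_size)
{
	for (size_t i = 0; i < nblocks; i++) {
		int err = posix_fadvise(fd, offsets[i], (off_t) block_size,
					POSIX_FADV_WILLNEED);
		if (err != 0)
			return err;
	}
	return 0;
}

/*
 * Hypothetical helper: read the previously hinted blocks into out, which
 * must hold nblocks * block_size bytes. Returns the total number of bytes
 * read, or -1 on error (errno set by pread()).
 */
ssize_t read_blocks(int fd, const off_t *offsets, size_t nblocks,
		    size_t block_size, char *out)
{
	ssize_t total = 0;

	for (size_t i = 0; i < nblocks; i++) {
		ssize_t n = pread(fd, out + i * block_size, block_size,
				  offsets[i]);
		if (n < 0)
			return -1;
		total += n;
	}
	return total;
}
```

The caller runs prefetch_blocks() over the whole batch first and
read_blocks() second. That only helps if the WILLNEED calls themselves
return quickly, which is the concern raised above.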

Greetings,

Andres Freund


