On Wed, Dec 11, 2019 at 7:29 AM Jens Axboe <axboe@xxxxxxxxx> wrote: > > Comments appreciated! This should work on any standard file system, > using either the generic helpers or iomap. I have tested ext4 and xfs > for the right read/write behavior, but no further validation has been > done yet. Patches are against current git, and can also be found here: I don't think this is conceptually wrong, but the implementation smells a bit. I commented on the trivial part (the horrendous argument list to iomap_actor), but I wonder how much of the explicit invalidation is actually needed? Because active invalidation really is a horrible horrible thing to do. It immediately means that you can't use this interface for normal everyday things that may actually cache perfectly fine. What happens if you simply never _activate_ the page? Then they should get invalidated on their own, without impacting any other load - but only when there is _some_ memory pressure. They'll just stay on the inactive lru list, and get re-used quickly. Note that there are two ways to activate a page: the "accessed while on the inactive list" will activate it, but these days we also have a "pre-activate" path in the workingset code (see workingset_refault()). Even if you might also want an explicit invalidate path, I would like to hear what it looks like if you instead of - or in addition to - invalidating, have a "don't activate" flag. We don't have all _that_ many places where we activate pages, and they should be easy to find (just grep for "SetPageActive()"), although the call chain may make it a bit painful to add a "don't do it for this access" kind of things. But I think most of the regular IO call chains come through "mark_page_accessed()". So _that_ is the part you want to avoid (and maybe the workingset code). And that should be fairly straightforward, I think. In fact, that you say that just a pure random read case causes lots of kswapd activity makes me think that maybe we've screwed up page activation in general, and never noticed (because if you have enough memory, you don't really see it that often)? So this might not be an io_ring issue, but an issue in general. Linus