On 12/11/19 1:03 PM, Linus Torvalds wrote:
> On Wed, Dec 11, 2019 at 11:34 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> I can't tell a difference in the results, there's no discernible
>> difference between NOT calling mark_page_accessed() or calling it.
>> Behavior seems about the same, in terms of pre and post page cache full,
>> and kswapd still churns a lot once the page cache is filled up.
>
> Yeah, that sounds like a bug. I'm sure the RWF_UNCACHED flag fixes it
> when you do the IO that way, but it seems to be a bug regardless.

Hard to disagree with that.

> Does /proc/meminfo have everything inactive for file data (ie the
> "Active(file)" line is basically zero?).

$ cat /proc/meminfo | grep -i active
Active:           134136 kB
Inactive:       28683916 kB
Active(anon):      97064 kB
Inactive(anon):        4 kB
Active(file):      37072 kB
Inactive(file): 28683912 kB

This is after a run with RWF_NOACCESS.

> Maybe pages got activated other ways (eg a problem with the workingset
> code)? You said "See patch below", but there wasn't any.

Oops, now below.

> That said, it's also entirely possible that even with everything in
> the inactive list, we might try to shrink other things first for
> whatever odd reason..
>
> The fact that you see that xas_create() so prominently would imply
> perhaps add_to_swap_cache(), which certainly implies that the page
> shrinking isn't hitting the file pages...

That's presumably misleading, as it's just lookups. But yes, confusing...
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5ea5fc167524..b2ecc66f5bd5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -316,6 +316,7 @@ enum rw_hint {
 #define IOCB_WRITE		(1 << 6)
 #define IOCB_NOWAIT		(1 << 7)
 #define IOCB_UNCACHED		(1 << 8)
+#define IOCB_NOACCESS		(1 << 9)
 
 struct kiocb {
 	struct file		*ki_filp;
@@ -3423,6 +3424,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		ki->ki_flags |= IOCB_APPEND;
 	if (flags & RWF_UNCACHED)
 		ki->ki_flags |= IOCB_UNCACHED;
+	if (flags & RWF_NOACCESS)
+		ki->ki_flags |= IOCB_NOACCESS;
 	return 0;
 }
 
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 357ebb0e0c5d..f20f0048d5c5 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -302,8 +302,10 @@ typedef int __bitwise __kernel_rwf_t;
 /* drop cache after reading or writing data */
 #define RWF_UNCACHED	((__force __kernel_rwf_t)0x00000040)
 
+#define RWF_NOACCESS	((__force __kernel_rwf_t)0x00000080)
+
 /* mask of flags supported by the kernel */
 #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
-			 RWF_APPEND | RWF_UNCACHED)
+			 RWF_APPEND | RWF_UNCACHED | RWF_NOACCESS)
 
 #endif /* _UAPI_LINUX_FS_H */
diff --git a/mm/filemap.c b/mm/filemap.c
index 4dadd1a4ca7c..c37b0e221a8a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2058,7 +2058,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		if (iocb->ki_flags & IOCB_NOWAIT)
 			goto would_block;
 		/* UNCACHED implies no read-ahead */
-		if (iocb->ki_flags & IOCB_UNCACHED)
+		if (iocb->ki_flags & (IOCB_UNCACHED|IOCB_NOACCESS))
 			goto no_cached_page;
 		page_cache_sync_readahead(mapping,
 				ra, filp,
@@ -2144,7 +2144,8 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		 * When a sequential read accesses a page several times,
 		 * only mark it as accessed the first time.
 		 */
-		if (prev_index != index || offset != prev_offset)
+		if ((prev_index != index || offset != prev_offset) &&
+		    !(iocb->ki_flags & IOCB_NOACCESS))
 			mark_page_accessed(page);
 		prev_index = index;

-- 
Jens Axboe