From: Shakeel Butt <shakeelb@xxxxxxxxxx>
Subject: mm: fadvise: avoid fadvise for fs without backing device

The fadvise() manpage is silent on fadvise()'s effect on memory-based
filesystems (shmem, hugetlbfs & ramfs) and pseudo file systems (procfs,
sysfs, kernfs).  The current implementation of fadvise is mostly a noop
for such filesystems, except for FADV_DONTNEED, which will trigger
expensive remote LRU cache draining.  This patch makes the noop of
fadvise() on such file systems very explicit.

However, this change has two side effects for ramfs and one for tmpfs.
First, fadvise(FADV_DONTNEED) could remove the unmapped clean zero'ed
pages of ramfs (allocated through read, readahead & read fault) and
tmpfs (allocated through read fault).  Also, fadvise(FADV_WILLNEED)
could create such clean zero'ed pages for ramfs.  This change removes
those possibilities.

One of our generic libraries does fadvise(FADV_DONTNEED).  Recently we
observed high latency in fadvise() and noticed that the users have
started using tmpfs files and the latency was due to expensive remote
LRU cache draining.  For normal tmpfs files (those with data written to
them), fadvise(FADV_DONTNEED) will always trigger the unneeded remote
cache draining.
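For illustration only, the latency described above could be observed from userspace with something like the sketch below. It is not part of the patch: timed_dontneed() is a hypothetical helper name, and the sketch assumes the library's call is equivalent to posix_fadvise() with POSIX_FADV_DONTNEED over the whole file.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): issue POSIX_FADV_DONTNEED over
 * the whole file and report how long the call took, in nanoseconds.
 * Returns 0 on success or an error number, which is posix_fadvise()'s own
 * return convention (it does not set errno).
 */
int timed_dontneed(int fd, int64_t *elapsed_ns)
{
	struct timespec start, end;
	int ret;

	clock_gettime(CLOCK_MONOTONIC, &start);
	/* offset 0 with len 0 covers the file from its start to its end */
	ret = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	clock_gettime(CLOCK_MONOTONIC, &end);

	*elapsed_ns = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
		    + (end.tv_nsec - start.tv_nsec);
	return ret;
}
```

Calling this on a file descriptor opened on a tmpfs mount, before and after the change, would be one way to compare the cost of the call; the absolute numbers depend on the machine and are not predicted here.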
Link: http://lkml.kernel.org/r/20170818011023.181465-1-shakeelb@xxxxxxxxxx
Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Hillf Danton <hillf.zj@xxxxxxxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Greg Thelen <gthelen@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/fadvise.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -puN mm/fadvise.c~mm-fadvise-avoid-fadvise-for-fs-without-backing-device mm/fadvise.c
--- a/mm/fadvise.c~mm-fadvise-avoid-fadvise-for-fs-without-backing-device
+++ a/mm/fadvise.c
@@ -52,7 +52,9 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, l
 		goto out;
 	}
 
-	if (IS_DAX(inode)) {
+	bdi = inode_to_bdi(mapping->host);
+
+	if (IS_DAX(inode) || (bdi == &noop_backing_dev_info)) {
 		switch (advice) {
 		case POSIX_FADV_NORMAL:
 		case POSIX_FADV_RANDOM:
@@ -75,8 +77,6 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, l
 	else
 		endbyte--;		/* inclusive */
 
-	bdi = inode_to_bdi(mapping->host);
-
 	switch (advice) {
 	case POSIX_FADV_NORMAL:
 		f.file->f_ra.ra_pages = bdi->ra_pages;
_