Our users reported that there're some random latency spikes when their RT process is running. Finally we found that latency spike is caused by FADV_DONTNEED. Which may call lru_add_drain_all() to drain LRU cache on remote CPUs, and then waits the per-cpu work to complete. The wait time is uncertain, which may be tens millisecond. That behavior is unreasonable, because this process is bound to a specific CPU and the file is only accessed by itself, IOW, there should be no pagecache pages on a per-cpu pagevec of a remote CPU. That unreasonable behavior is partially caused by the wrong comparation of the number of invalidated pages and the number of the target. For example, if (count < (end_index - start_index + 1)) The count above is how many pages were invalidated in the local CPU, and (end_index - start_index + 1) is how many pages should be invalidated. The usage of (end_index - start_index + 1) is incorrect, because they are virtual addresses, which may not mapped to pages. We'd better use inode->i_data.nrpages as the target. Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> --- mm/fadvise.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/fadvise.c b/mm/fadvise.c index 0e66f2aaeea3..ec25c91194a3 100644 --- a/mm/fadvise.c +++ b/mm/fadvise.c @@ -163,7 +163,7 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice) * a per-cpu pagevec for a remote CPU. Drain all * pagevecs and try again. */ - if (count < (end_index - start_index + 1)) { + if (count < inode->i_data.nrpages) { lru_add_drain_all(); invalidate_mapping_pages(mapping, start_index, end_index); -- 2.17.1