On Tue, Sep 22, 2020 at 3:23 PM Mel Gorman <mgorman@xxxxxxx> wrote:
>
> On Tue, Sep 22, 2020 at 10:12:31AM +0800, Yafang Shao wrote:
> > On Tue, Sep 22, 2020 at 6:34 AM Mel Gorman <mgorman@xxxxxxx> wrote:
> > >
> > > On Mon, Sep 21, 2020 at 09:43:17AM +0800, Yafang Shao wrote:
> > > > Our users reported some random latency spikes while their RT
> > > > process is running. We finally found that the latency spike is
> > > > caused by FADV_DONTNEED, which may call lru_add_drain_all() to
> > > > drain the LRU cache on remote CPUs and then wait for the per-cpu
> > > > work to complete. The wait time is uncertain and may be tens of
> > > > milliseconds.
> > > > That behavior is unreasonable, because this process is bound to a
> > > > specific CPU and the file is only accessed by itself, IOW, there
> > > > should be no pagecache pages on a per-cpu pagevec of a remote CPU.
> > > > That unreasonable behavior is partially caused by the wrong
> > > > comparison of the number of invalidated pages against the target
> > > > number. For example,
> > > >
> > > >         if (count < (end_index - start_index + 1))
> > > >
> > > > The count above is how many pages were invalidated on the local
> > > > CPU, and (end_index - start_index + 1) is how many pages should be
> > > > invalidated. The usage of (end_index - start_index + 1) is
> > > > incorrect, because these are file offsets, which may not all be
> > > > backed by pages. We'd better use inode->i_data.nrpages as the
> > > > target.
> > > >
> > >
> > > How does that work if the invalidation is for a subset of the file?
> > >
> >
> > I realized that as well. There are some ways to improve it.
> >
> > Option 1, take the min as the target:
> >
> > -       if (count < (end_index - start_index + 1)) {
> > +       target = min_t(unsigned long, inode->i_data.nrpages,
> > +                      end_index - start_index + 1);
> > +       if (count < target) {
> >                 lru_add_drain_all();
> >
> > Option 2, change the prototype of invalidate_mapping_pages and then
> > check how many pages were skipped:
> >
> > +struct invalidate_stat {
> > +       unsigned long skipped;          /* how many pages were skipped */
> > +       unsigned long invalidated;      /* how many pages were invalidated */
> > +};
> >
> > -unsigned long invalidate_mapping_pages(struct address_space *mapping,
> > +unsigned long invalidate_mapping_pages(struct address_space *mapping,
> > +               struct invalidate_stat *stat,
> >
>
> That would involve updating each caller and the struct is
> unnecessarily heavy. Create one that returns via **nr_lruvec. For
> invalidate_mapping_pages, pass in NULL as nr_lruvec. Create a new helper
> for fadvise that accepts nr_lruvec. In the common helper, account for pages
> that are likely on an LRU and count them in nr_lruvec if !NULL. Update
> fadvise to drain only if pages were skipped that were on the lruvec. That
> should also deal with the case where holes have been punched between
> start and end.
>

Good suggestion, thanks Mel. I will send v2; a rough sketch of the
interface I have in mind is appended below.

--
Thanks
Yafang
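
The sketch below is only meant to illustrate the shape I took from your
suggestion, not the actual v2: the helper names
(__invalidate_mapping_pages(), invalidate_mapping_pagevec()) and the
nr_lruvec parameter are placeholders I made up, and the page-cache walk
itself is left exactly as it is today.

/* mm/truncate.c (sketch) */

/*
 * Common worker: same page-cache walk as the current
 * invalidate_mapping_pages(). The only addition is that, when nr_lruvec
 * is non-NULL, every page that invalidate_inode_page() refuses to drop
 * but that looks like it is only pinned by a per-cpu LRU pagevec is
 * counted in *nr_lruvec.
 */
static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
		pgoff_t start, pgoff_t end, unsigned long *nr_lruvec)
{
	unsigned long count = 0;

	/* ... existing pagevec walk calling invalidate_inode_page() ... */

	return count;
}

/* Existing entry point keeps its prototype, so no caller changes. */
unsigned long invalidate_mapping_pages(struct address_space *mapping,
		pgoff_t start, pgoff_t end)
{
	return __invalidate_mapping_pages(mapping, start, end, NULL);
}

/* New entry point used only by fadvise(POSIX_FADV_DONTNEED). */
void invalidate_mapping_pagevec(struct address_space *mapping,
		pgoff_t start, pgoff_t end, unsigned long *nr_lruvec)
{
	__invalidate_mapping_pages(mapping, start, end, nr_lruvec);
}

/* mm/fadvise.c, POSIX_FADV_DONTNEED case (sketch) */

	unsigned long nr_lruvec = 0;

	invalidate_mapping_pagevec(mapping, start_index, end_index,
				   &nr_lruvec);

	/*
	 * Only pay for the remote drain when some pages were skipped
	 * because they are likely sitting on a per-cpu pagevec. Holes
	 * punched between start and end no longer force a drain.
	 */
	if (nr_lruvec) {
		lru_add_drain_all();
		invalidate_mapping_pages(mapping, start_index, end_index);
	}

Compared with my option 1, this avoids draining when the skipped pages
are pinned for other reasons (e.g. still mapped or dirty), and keeping
the old prototype means the other invalidate_mapping_pages() callers
stay untouched.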