Minchan Kim <minchan@xxxxxxxxxx> writes: > On Tue, Sep 12, 2017 at 03:29:45PM +0800, Huang, Ying wrote: >> Minchan Kim <minchan@xxxxxxxxxx> writes: >> >> > On Tue, Sep 12, 2017 at 02:44:36PM +0800, Huang, Ying wrote: >> >> Minchan Kim <minchan@xxxxxxxxxx> writes: >> >> >> >> > On Tue, Sep 12, 2017 at 01:23:01PM +0800, Huang, Ying wrote: >> >> >> Minchan Kim <minchan@xxxxxxxxxx> writes: >> >> >> >> >> >> > page_cluster 0 means "we don't want readahead" so in the case, >> >> >> > let's skip the readahead detection logic. >> >> >> > >> >> >> > Cc: "Huang, Ying" <ying.huang@xxxxxxxxx> >> >> >> > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx> >> >> >> > --- >> >> >> > include/linux/swap.h | 3 ++- >> >> >> > 1 file changed, 2 insertions(+), 1 deletion(-) >> >> >> > >> >> >> > diff --git a/include/linux/swap.h b/include/linux/swap.h >> >> >> > index 0f54b491e118..739d94397c47 100644 >> >> >> > --- a/include/linux/swap.h >> >> >> > +++ b/include/linux/swap.h >> >> >> > @@ -427,7 +427,8 @@ extern bool has_usable_swap(void); >> >> >> > >> >> >> > static inline bool swap_use_vma_readahead(void) >> >> >> > { >> >> >> > - return READ_ONCE(swap_vma_readahead) && !atomic_read(&nr_rotate_swap); >> >> >> > + return page_cluster > 0 && READ_ONCE(swap_vma_readahead) >> >> >> > + && !atomic_read(&nr_rotate_swap); >> >> >> > } >> >> >> > >> >> >> > /* Swap 50% full? Release swapcache more aggressively.. */ >> >> >> >> >> >> Now the readahead window size of the VMA based swap readahead is >> >> >> controlled by /sys/kernel/mm/swap/vma_ra_max_order, while that of the >> >> >> original swap readahead is controlled by sysctl page_cluster. It is >> >> >> possible for anonymous memory to use VMA based swap readahead and tmpfs >> >> >> to use original swap readahead algorithm at the same time. So that, I >> >> >> think it is necessary to use different control knob to control these two >> >> >> algorithm. So if we want to disable readahead for tmpfs, but keep it >> >> >> for VMA based readahead, we can set 0 to page_cluster but non-zero to >> >> >> /sys/kernel/mm/swap/vma_ra_max_order. With your change, this will be >> >> >> impossible. >> >> > >> >> > For a long time, page-cluster have been used as controlling swap readahead. >> >> > One of example, zram users have been disabled readahead via 0 page-cluster. >> >> > However, with your change, it would be regressed if it doesn't disable >> >> > vma_ra_max_order. >> >> > >> >> > As well, all of swap users should be aware of vma_ra_max_order as well as >> >> > page-cluster to control swap readahead but I didn't see any document about >> >> > that. Acutaully, I don't like it but want to unify it with page-cluster. >> >> >> >> The document is in >> >> >> >> Documentation/ABI/testing/sysfs-kernel-mm-swap >> >> >> >> The concern of unifying it with page-cluster is as following. >> >> >> >> Original swap readahead on tmpfs may not work well because the combined >> >> workload is running, so we want to disable or constrain it. But at the >> >> same time, the VMA based swap readahead may work better. So I think it >> >> may be necessary to control them separately. >> > >> > My concern is users have been disabled swap readahead by page-cluster would >> > be regressed. Please take care of them. >> >> How about disable VMA based swap readahead if zram used as swap? Like >> we have done for hard disk? > > It could be with SWP_SYNCHRONOUS_IO flag which indicates super-fast, > no seek cost swap devices if this patchset is merged so VM automatically > disables readahead. It is in my TODO but it's orthogonal work. > > The problem I raised is "Why shouldn't we obey user's decision?", > not zram sepcific issue. > > A user has used SSD as swap devices decided to disable swap readahead > by some reason(e.g., small memory system). Anyway, it has worked > via page-cluster for a several years but with vma-based swap devices, > it doesn't work any more. Can they add one more line to their configuration scripts? echo 0 > /sys/kernel/mm/swap/vma_ra_max_order Best Regards, Huang, Ying -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>