On Wed, 1 Jul 2020, Dave Hansen wrote: > On 7/1/20 1:04 PM, Ben Widawsky wrote: > >> +static inline bool node_reclaim_enabled(void) > >> +{ > >> + /* Is any node_reclaim_mode bit set? */ > >> + return node_reclaim_mode & (RECLAIM_ZONE|RECLAIM_WRITE|RECLAIM_UNMAP); > >> +} > >> + > >> extern void check_move_unevictable_pages(struct pagevec *pvec); > >> > >> extern int kswapd_run(int nid); > > If a user writes a bit that isn't a RECLAIM_* bit to vm.zone_reclaim_mode > > today, it acts as though RECLAIM_ZONE is enabled: we try to reclaim in > > zonelist order before falling back to the next zone in the page allocator. > > The sysctl doesn't enforce any max value :/ I dont know if there is any > > such user, but this would break them if there is. > > > > Should this simply be return !!node_reclaim_mode? > > You're right that there _could_ be a user-visible behavior change here. > But, if there were a change it would be for a bit which wasn't even > mentioned in the documentation. Somebody would have had to look at the > doc mentioning 1,2,4 and written an 8. If they did that, they're asking > for trouble because we could have defined the '8' bit to do nasty things > like auto-demote all your memory. :) > > I'll mention it in the changelog, but I still think we should check the > actual, known bits rather than check for 0. > > BTW, in the hardware, they almost invariably make unused bits "reserved" > and do mean things like #GP if someone tries to set them. This is a > case where the kernel probably should have done the same. It would have > saved us the trouble of asking these questions now. Maybe we should > even do that going forward. > Maybe enforce it in a sysctl handler so the user catches any errors, which would be better than silently accepting some policy that doesn't exist? RECLAIM_UNMAP and/or RECLAIM_WRITE should likely get -EINVAL if attempted to be set without RECLAIM_ZONE as well: they are no-ops without RECLAIM_ZONE. This would likely have caught something wrong with commit 648b5cf368e0 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE") if it would have already been in place. I don't feel strongly about this, so feel free to ignore.