On Tue, Aug 6, 2019 at 7:09 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Tue 06-08-19 18:59:52, Yafang Shao wrote:
> > On Tue, Aug 6, 2019 at 6:28 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Tue 06-08-19 17:54:02, Yafang Shao wrote:
> > > > On Tue, Aug 6, 2019 at 5:50 PM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Aug 06, 2019 at 11:25:31AM +0200, Michal Hocko wrote:
> > > > > > On Tue 06-08-19 17:15:05, Yafang Shao wrote:
> > > > > > > On Tue, Aug 6, 2019 at 5:05 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > > > > [...]
> > > > > > > > > As you said, the direct reclaim path set it to 1, but the
> > > > > > > > > __node_reclaim() forgot to process may_shrink_slab.
> > > > > > > >
> > > > > > > > OK, I am blind obviously. Sorry about that. Anyway, why cannot we simply
> > > > > > > > get back to the original behavior by setting may_shrink_slab in that
> > > > > > > > path as well?
> > > > > > >
> > > > > > > You mean do it as the commit 0ff38490c836 did before?
> > > > > > > I haven't checked in which commit the shrink_slab() is removed from
> > > > > >
> > > > > > What I've had in mind was essentially this:
> > > > > >
> > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > index 7889f583ced9..8011288a80e2 100644
> > > > > > --- a/mm/vmscan.c
> > > > > > +++ b/mm/vmscan.c
> > > > > > @@ -4088,6 +4093,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> > > > > >                 .may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
> > > > > >                 .may_swap = 1,
> > > > > >                 .reclaim_idx = gfp_zone(gfp_mask),
> > > > > > +               .may_shrinkslab = 1,
> > > > > >         };
> > > > > >
> > > > > >         trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
> > > > > >
> > > > > > shrink_node path already does shrink slab when the flag allows that. In
> > > > > > other words get us back to before 1c30844d2dfe because that has clearly
> > > > > > changed the long term node reclaim behavior just recently.
> > > > >
> > > > > I'd be fine with this change. It was not intentional to significantly
> > > > > change the behaviour of node reclaim in that patch.
> > > > >
> > > >
> > > > But if we do it like this, there will be a bug in the knob vm.min_slab_ratio.
> > > > Right?
> > >
> > > Yes, and the answer for that is a question why do we even care? Which
> > > real life workload does suffer from the min_slab_ratio misbehavior?
> > > Also it is much more preferred to fix an obvious bug/omission which
> > > lack of may_shrinkslab in node reclaim seems to be than a larger rewrite
> > > with harder to see changes.
> > >
> >
> > Fixing the bug in min_slab_ratio doesn't require much change, as it
> > just introduces a new bit in scan_control, which doesn't require more
> > space, and an if-branch in shrink_node(), which doesn't take many cpu
> > cycles either. It will not take much maintenance either, as no_pagecache
> > is 0 by default, so we don't need to worry about what happens if we
> > forget it.
>
> You are still missing my point, I am afraid. I am not saying your change
> is wrong or complex. I am saying that there is an established behavior
> (even when wrong) that node-reclaim dependent loads might depend on.
> Your testing doesn't really suggest you have done much testing beyond
> the targeted one which is quite artificial to say the least.
>
> Maybe there are workloads which do depend on proper min_slab_ratio
> behavior but it would be much more preferable to hear from them rather
> than change the behavior based on the code inspection and a
> microbenchmark.
>
> Is my thinking more clear now?
>

Thanks for your clarification.
I get your point now.

Thanks
Yafang
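
For anyone following the thread, the gating discussed above can be sketched in plain userspace C. This is only an illustration, not the code in mm/vmscan.c: struct scan_control_sketch, the sketch_* helpers and the page counts 32 and 8 are all made up for the example. The only thing taken from the thread is the idea that the shrink_node path shrinks slab when may_shrinkslab is set, and that __node_reclaim() has to set that bit for slab to be touched at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's struct scan_control.
 * Hypothetical: only the bits mentioned in the thread are modeled.
 */
struct scan_control_sketch {
	unsigned int may_unmap:1;
	unsigned int may_swap:1;
	unsigned int may_shrinkslab:1;
	unsigned long nr_reclaimed;
};

/* Made-up placeholders for LRU and slab reclaim; the numbers are arbitrary. */
static unsigned long sketch_shrink_lru(void)  { return 32; }
static unsigned long sketch_shrink_slab(void) { return 8; }

/* Slab is only shrunk when the caller allowed it via may_shrinkslab. */
static unsigned long sketch_shrink_node(struct scan_control_sketch *sc)
{
	assert(sc != NULL);

	sc->nr_reclaimed += sketch_shrink_lru();
	if (sc->may_shrinkslab)
		sc->nr_reclaimed += sketch_shrink_slab();

	return sc->nr_reclaimed;
}

/*
 * Models the node reclaim entry point. Passing false mimics the
 * initializer without the flag, passing true mimics the one-line fix.
 */
static unsigned long sketch_node_reclaim(bool set_may_shrinkslab)
{
	struct scan_control_sketch sc = {
		.may_unmap = 1,
		.may_swap = 1,
		.may_shrinkslab = set_may_shrinkslab,
	};

	return sketch_shrink_node(&sc);
}
```

With the flag left unset, sketch_node_reclaim() only counts the LRU pages. With it set, the slab pages are added on top, which is the difference the one-line change is meant to restore.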