On Mon, Apr 27, 2020 at 7:24 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote: > > On Mon 27-04-20 19:06:52, Yafang Shao wrote: > > On Mon, Apr 27, 2020 at 6:50 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote: > > > > > > On Mon 27-04-20 18:09:27, Yafang Shao wrote: > > > > On Mon, Apr 27, 2020 at 5:40 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote: > > > > > > > > > > On Sat 25-04-20 11:24:18, Yafang Shao wrote: > > > > > > Since proportional memory.{min, low} reclaim is introduced in > > > > > > commit 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim"), > > > > > > it have been proved that the proportional reclaim is hard to understand and > > > > > > the issues caused by it is harder to understand.[1]. That dilemma faced by > > > > > > us is caused by that the proportional reclaim mixed up memcg and the > > > > > > reclaim context. > > > > > > > > > > > > In proportional reclaim, the whole reclaim context - includes the memcg > > > > > > to be reclaimed and the reclaimer, should be considered, rather than > > > > > > memcg only. > > > > > > > > > > > > To make it clear, a new member 'protection' is introduced in the reclaim > > > > > > context (struct shrink_control) to replace mem_cgroup_protection(). This > > > > > > > > > > s@shrink_control@scan_control@ > > > > > > > > > > > > > Thanks for pointing this out. > > > > > > > > > > one is set when we check whether the memcg is protected or not. > > > > > > > > > > > > After this change, the issue pointed by me[1] - a really old left-over > > > > > > value can slow donw target reclaim - can be fixed, and I think it could > > > > > > also avoid some potential race. > > > > > > > > > > The patch would have been much esier to review if you only focused on > > > > > the effective protection value caching. I really fail to see why you had > > > > > to make mem_cgroup_protected even more convoluted with more side effects > > > > > (e.g. sc->memcg_low_skipped). This goes directly opposite to what > > > > > Johannes was proposing in other email AFAICS. > > > > > > > > > > > > > Sorry, I failed to see what the advantage of Johannes's proposal > > > > except the better naming. > > > > > > The immediate advantage is that predicate should better not have any > > > side effect. > > > > No, it still has side effect. That is not an immediate advantage neither. > > See bellow, > > I believe you have misunderstood the proposal. > mem_cgroup_below_{min,low,protection} won't have any side effect on the > memcg. It would be only mem_cgroup_calculate_protection which updates > the cached state. So there won't be any predicate to have side effect. > Maybe I misunderstood this porposal, or both of us misunderstood this proposal. > > > So splitting into the calculation part which has clearly > > > defined side effects and having a simple predicate that consults > > > pre-calculated values makes a lot of sense to me. > > > > > > > When you calculate the effective values, the source values to > > calculate it may be changed with these values when you do the > > predicate. > > IOW, this proposal greatly increase the race window. > > Why do you think so? > See above, I don't think it is proper to disccuss it any more until we see the code. > > > > > Your changelog doesn't explain why double caching the effective value is > > > > > an improvement. > > > > > > > > The improvement is, to avoid getting an wrong value calculated by > > > > other reclaimers and avoid issues in mem_cgroup_protection() that we > > > > haven't noticed. > > > > > > This is not true in general. There is still parallel calculation done > > > and so parallel reclaimers might affect each other. Your patch only > > > makes a real difference for leaf memcgs which are the reclaim target as > > > well. > > > > The race between parallel reclaimers is fundamentally exist, and we > > can do now is reducing the race window as far as possible. > > Reducing the race window is futile. The situation changes after each > reclaim attempt. All we need is to keep a protection consistent > regardless of where the reclaim root. That means that hierarchies deeper > in the tree cannot override the protection from those which are higher. > That is what I have implemented in this patchset. > > > All intermediate nodes really do not care about the cached values > > > because they do not have any pages on the LRU lists. > > > > > > > But read these cached values can save lot of time against the existing > > code, as the existing code still calculates them even if they are > > useless. > > They are not useless. Intermediate values are necessary for the > protection distribution for lower level memcgs. > -- Thanks Yafang