2016-08-24 16:04 GMT+09:00 Michal Hocko <mhocko@xxxxxxxxxx>:
> On Wed 24-08-16 14:01:57, Joonsoo Kim wrote:
>> Looks like my mail client ate my reply, so I resend.
>>
>> On Tue, Aug 23, 2016 at 09:33:18AM +0200, Michal Hocko wrote:
>> > On Tue 23-08-16 13:52:45, Joonsoo Kim wrote:
>> > [...]
>> > > Hello, Michal.
>> > >
>> > > I agree with a partial revert, but the revert should take a different
>> > > form. The change below tries to reuse the should_compact_retry()
>> > > version for !CONFIG_COMPACTION, but it turned out that it also causes
>> > > a regression in Markus' report [1].
>> >
>> > I would argue that CONFIG_COMPACTION=n behaves so arbitrarily for high
>> > order workloads that calling any change in that behavior a regression
>> > is a little exaggerated. Disabling compaction should have a very
>> > strong reason. I haven't heard any so far. I am even wondering whether
>> > there is a legitimate reason for that these days.
>> >
>> > > The theoretical reason for this regression is that it would stop
>> > > retrying even if there are enough lru pages. It only checks whether
>> > > free pages exceed the min watermark when making the retry decision.
>> > > To prevent a premature OOM kill, we need to keep the allocation loop
>> > > going while there are enough lru pages. So, the logic should be
>> > > something like this:
>> > >
>> > > should_compact_retry()
>> > > {
>> > >     for_each_zone_zonelist_nodemask {
>> > >         available = zone_reclaimable_pages(zone);
>> > >         available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
>> > >         if (__zone_watermark_ok(zone, *0*, min_wmark_pages(zone),
>> > >                 ac_classzone_idx(ac), alloc_flags, available))
>> > >             return true;
>> > >     }
>> > > }
>> > >
>> > > I suggested it before, and the current situation shows that it is
>> > > indeed needed.
>> >
>> > This just opens the door to unbounded reclaim/thrashing because
>> > you can reclaim as much as you like and there is no guarantee of
>> > forward progress. The reason why the !COMPACTION should_compact_retry
>> > only checks for the min_wmark without the reclaimable bias is that this
>> > will guarantee a retry if we are failing due to the high order wmark
>> > check rather than a lack of memory. This condition is guaranteed to
>> > converge, and the probability of unbounded reclaim is much reduced.
>>
>> In the case of a lack of memory with a lot of reclaimable lru pages, why
>> do we stop reclaim/compaction?
>>
>> With your partial revert patch, the allocation logic would be as
>> follows.
>>
>> Assume the following situation:
>> o a lot of reclaimable lru pages
>> o no order-2 freepage
>> o not enough order-0 freepages for the min watermark
>> o an order-2 allocation
>>
>> 1. the order-2 allocation fails due to the min watermark
>> 2. go to reclaim/compaction
>> 3. reclaim some pages (maybe SWAP_CLUSTER_MAX (32) pages) but the min
>>    watermark still isn't met for order-0
>> 4. compaction is skipped due to not enough freepages
>> 5. should_reclaim_retry() returns false because the min watermark for
>>    an order-2 page isn't met
>> 6. should_compact_retry() returns false because the min watermark for
>>    an order-0 page isn't met
>> 7. the allocation fails without any retry and the OOM killer is invoked
>
> If the direct reclaim is not able to get us over the min wmark for
> order-0 then we would be likely to hit the oom even for order-0
> requests.

No, this is a situation where direct reclaim can get us over the min
wmark for order-0, but it needs to retry. IIUC, direct reclaim would not
reclaim enough memory at once. It tries to reclaim a small number of lru
pages and breaks out to check the watermark.
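To make that concrete, below is a fuller sketch of the check I have in
mind. It is untested and only illustrative; it assumes the usual context
available in __alloc_pages_slowpath() ('ac', 'order', 'alloc_flags') and
the current helper signatures, so take the details with a grain of salt:

    static inline bool
    should_compact_retry(struct alloc_context *ac, unsigned int order,
                         unsigned int alloc_flags)
    {
            struct zone *zone;
            struct zoneref *z;

            for_each_zone_zonelist_nodemask(zone, z, ac->zonelist,
                            ac->high_zoneidx, ac->nodemask) {
                    unsigned long available;

                    available = zone_reclaimable_pages(zone);
                    available += zone_page_state_snapshot(zone,
                                                    NR_FREE_PAGES);

                    /*
                     * Note: checked against order-0, not 'order'. Keep
                     * retrying as long as even an order-0 request could
                     * pass the min watermark if everything reclaimable
                     * were actually reclaimed.
                     */
                    if (__zone_watermark_ok(zone, 0, min_wmark_pages(zone),
                                    ac_classzone_idx(ac), alloc_flags,
                                    available))
                            return true;
            }

            return false;
    }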
>> Is this what you want?
>>
>> And, please elaborate more on how your logic guarantees convergence.
>> After order-0 free pages exceed the min watermark, there is no way to
>> stop reclaim/thrashing. The number of free pages just increases
>> monotonically, and the retry cannot be stopped until the order-2
>> allocation succeeds. Am I missing something?
>
> My statement was imprecise at best. You are right that there is no
> guarantee to fulfill an order-2 request. What I meant to say is that we
> should converge when we are getting out of memory (aka even order-0
> would have a hard time succeeding). should_reclaim_retry does that by
> the back off scaling of the reclaimable pages. should_compact_retry
> would have to do the same thing, which would effectively turn it into
> should_reclaim_retry.

So, I suggested changing should_reclaim_retry() for high order requests
before.

>> > > And, I still think that your OOM detection rework has some flaws.
>> > >
>> > > 1) It doesn't consider freeable objects that can be freed by
>> > > shrink_slab(). There are many subsystems that cache many objects,
>> > > and they will be freed by the shrink_slab() interface. But you
>> > > don't account for them when making the OOM decision.
>> >
>> > I fully rely on the reclaim and compaction feedback. And that is the
>> > place where we should strive for improvements. So if we are growing
>> > way too many slab objects we should take care of that in the slab
>> > reclaim, which is tightly coupled with the LRU reclaim, rather than
>> > a layer up in the page allocator.
>>
>> No. Slab shrink logic that is tightly coupled with the LRU reclaim
>> totally makes sense.
>
> Once the number of slab objects is much larger than the number of LRU
> pages (which we have seen in some oom reports), the way they are coupled
> just stops making sense because the current approach no longer scales.
> We might not have cared before because we used to retry blindly. At
> least that is my understanding.

If your logic guaranteed retrying until the number of lru pages has been
scanned, it would work well. It's not a problem of the slab shrink.

> I am sorry to skip large parts of your email but I believe those things
> have been discussed and we would just repeat ourselves here. I fully
> understand

Okay. We have discussed it several times, and I'm also tired of
discussing this topic.

Thanks.
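P.S. For reference, the "back off scaling" mentioned above is this part
of should_reclaim_retry(), quoted from memory and simplified, so treat
it as a sketch rather than the exact code:

    /*
     * Per zone in should_reclaim_retry(): the reclaimable contribution
     * is discounted linearly by the number of no-progress loops, so
     * after MAX_RECLAIM_RETRIES (16) loops only the free pages count
     * and the retry loop is guaranteed to terminate.
     */
    available = reclaimable = zone_reclaimable_pages(zone);
    available -= DIV_ROUND_UP(no_progress_loops * available,
                              MAX_RECLAIM_RETRIES);
    available += zone_page_state_snapshot(zone, NR_FREE_PAGES);

    /* would the allocation succeed if all of 'available' were reclaimed? */
    if (__zone_watermark_ok(zone, order, min_wmark_pages(zone),
                    ac_classzone_idx(ac), alloc_flags, available))
            return true;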