On Fri, 28 Jan 2011 17:25:58 +0900
Minchan Kim <minchan.kim@xxxxxxxxx> wrote:

> Hi Hannes,
>
> On Fri, Jan 28, 2011 at 5:17 PM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > On Fri, Jan 28, 2011 at 05:04:16PM +0900, Minchan Kim wrote:
> >> Hi Kame,
> >>
> >> On Fri, Jan 28, 2011 at 1:58 PM, KAMEZAWA Hiroyuki
> >> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> >> > How about this ?
> >> > ==
> >> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> >> >
> >> > Current memory cgroup's code tends to assume page_size == PAGE_SIZE
> >> > and arrangement for THP is not enough yet.
> >> >
> >> > This is one of fixes for supporting THP. This adds
> >> > mem_cgroup_check_margin() and checks whether there are required amount of
> >> > free resource after memory reclaim. By this, THP page allocation
> >> > can know whether it really succeeded or not and avoid infinite-loop
> >> > and hangup.
> >> >
> >> > Total fixes for do_charge()/reclaim memory will follow this patch.
> >>
> >> If this patch is only related to THP, I think patch order isn't good.
> >> Before applying [2/4], huge page allocation will retry without
> >> reclaiming and loop forever by below part.
> >>
> >> @@ -1854,9 +1858,6 @@ static int __mem_cgroup_do_charge(struct
> >>       } else
> >>               mem_over_limit = mem_cgroup_from_res_counter(fail_res, res);
> >>
> >> -     if (csize > PAGE_SIZE) /* change csize and retry */
> >> -             return CHARGE_RETRY;
> >> -
> >>       if (!(gfp_mask & __GFP_WAIT))
> >>               return CHARGE_WOULDBLOCK;
> >>
> >> Am I missing something?
> >
> > No, you are correct.  But I am not sure the order really matters in
> > theory: you have two endless loops that need independent fixing.
>
> That's why I ask a question.
> Two endless loops?
>
> One is what I mentioned. The other is what?
> Maybe this patch solves the other.
> But I can't guess it by only this description. Stupid..
>
> Please open my eyes.
>

One is:

	if (csize > PAGE_SIZE)
		return CHARGE_RETRY;

Because of this, reclaim is never called.

The other is the check after memory reclaim.
==
	ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, NULL,
					      gfp_mask, flags);
	/*
	 * try_to_free_mem_cgroup_pages() might not give us a full
	 * picture of reclaim. Some pages are reclaimed and might be
	 * moved to swap cache or just unmapped from the cgroup.
	 * Check the limit again to see if the reclaim reduced the
	 * current usage of the cgroup before giving up
	 */
	if (ret || mem_cgroup_check_under_limit(mem_over_limit))
		return CHARGE_RETRY;
==

ret != 0 if even one page is reclaimed. Then khugepaged retries the
charge, still cannot get enough room, reclaims one page, and retries
again. So, in a busy memcg, an HPAGE_SIZE allocation never fails; it
just keeps retrying.

Even if khugepaged luckily manages to allocate HPAGE_SIZE, it walks
vmas one by one and tries to collapse each pmd under mmap_sem, so this
looks like a hang in khugepaged: an infinite loop.

Thanks,
-Kame