Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

Hi Eiichi,

I agree with Michal's points, and I think there are also some other design choices which don't quite make sense to me. Perhaps you can clear them up? :-)

Eiichi Tsukata writes:
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4bdb58ab14cb..e2d57200fd00 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1726,8 +1726,8 @@ static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
 * balanced over allowed nodes.
 * Called with hugetlb_lock locked.
 */
-static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
-							 bool acct_surplus)
+int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
+			bool acct_surplus)
{
	int nr_nodes, node;
	int ret = 0;

The immediate red flag to me is that we're investing further mm knowledge into hugetlb. For the vast majority of intents and purposes, hugetlb exists outside of the typical memory management lifecycle, and historic behaviour has been to treat it as a separate reserve that we don't touch. We expect hugetlb to be a reserve which is by and large explicitly managed by the system administrator, not by us, and this patch seems to violate that expectation.

Shoehorning shrink-on-OOM support into it seems a little suspicious to me, because we already have a modernised system for huge pages that handles not only this, but many other memory management situations: THP. THP not only has support for this particular case, but also many other features which are necessary to coherently manage it as part of the mm lifecycle. For that reason, I'm not convinced that this composes into a sensible interface.

As some example questions which appear unresolved to me: if hugetlb pages are lost, what mechanisms will we provide to tell automation or the system administrator what to do in that scenario? How should the interface for resolving hugepage starvation due to repeated OOMs look? By what metrics will you decide if releasing the hugepage is worse for the system than selecting a victim for OOM? Why can't the system use the existing THP mechanisms to resolve this ahead of time?
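
To make the first of those questions concrete: as far as I can tell, one way automation could notice today that the pool has shrunk is to poll HugePages_Total in /proc/meminfo and compare it against the size that was configured. A rough userspace sketch of that follows. The helper names meminfo_field(), read_hugepages_total() and hugepages_lost() are made up for illustration and are not part of the patch or of any existing interface.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "key:" at the start of a line in
 * meminfo-style text and return the number that follows it.
 * Returns -1 if the key is not present.
 */
long meminfo_field(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		long val;

		if (strncmp(line, key, klen) == 0 && line[klen] == ':' &&
		    sscanf(line + klen + 1, "%ld", &val) == 1)
			return val;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Hypothetical helper: HugePages_Total from /proc/meminfo, -1 on error. */
long read_hugepages_total(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return meminfo_field(buf, "HugePages_Total");
}

/*
 * Hypothetical helper: how many pages the pool is short of the size
 * the administrator asked for. Returns 0 if nothing is missing or if
 * the current size could not be read.
 */
long hugepages_lost(long expected, long current)
{
	return (current >= 0 && current < expected) ? expected - current : 0;
}
```

A monitoring agent would call read_hugepages_total() periodically and pass the result to hugepages_lost() along with the configured pool size. Note that this only tells you that pages went missing, not why, which is part of what I'd like to see addressed.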

Thanks,

Chris


