On 07/28/20 at 09:46am, Mike Kravetz wrote:
> On 7/28/20 6:24 AM, Baoquan He wrote:
> > Hi Muchun,
> >
> > On 07/28/20 at 11:49am, Muchun Song wrote:
> >> In the reservation routine, we only check whether the cpuset meets
> >> the memory allocation requirements. But we ignore the mempolicy of
> >> MPOL_BIND case. If someone's mmap of hugetlb succeeds, the subsequent
> >> memory allocation may still fail due to mempolicy restrictions, and
> >> the process receives the SIGBUS signal. This can be reproduced by the
> >> following steps.
> >>
> >>  1) Compile the test case.
> >>     cd tools/testing/selftests/vm/
> >>     gcc map_hugetlb.c -o map_hugetlb
> >>
> >>  2) Pre-allocate huge pages. Suppose there are 2 numa nodes in the
> >>     system. Each node will pre-allocate one huge page.
> >>     echo 2 > /proc/sys/vm/nr_hugepages
> >>
> >>  3) Run test case (mmap 4MB). We receive the SIGBUS signal.
> >>     numactl --membind=0 ./map_hugetlb 4
> >
> > I think supporting the mempolicy of MPOL_BIND case is a good idea.
> > I am wondering what about the other mempolicy cases, e.g. MPOL_INTERLEAVE,
> > MPOL_PREFERRED. Asking these because we already have similar handling in
> > sysfs, proc nr_hugepages_mempolicy writing. Please see
> > __nr_hugepages_store_common() for detail.
>
> There is a high level difference in the function of this code and the code
> called by the sysfs and proc interfaces. This patch is dealing with reserving
> huge pages in the pool for later use. The sysfs and proc interfaces are
> allocating huge pages to be added to the pool.
>
> Using mempolicy to decide how to allocate huge pages is pretty straight
> forward. Using mempolicy to reserve pages is almost impossible to get
> correct. The comment at the beginning of hugetlb_acct_memory() and modified
> by this patch summarizes the issues.
>
> IMO, at this time it makes little sense to perform checks for more than
> MPOL_BIND at reservation time. If we ever take on the monumental task of
> supporting mempolicy directed per-node reservations throughout the life of
> a process, support for other policies will need to be taken into account.

I have not yet clearly understood why using mempolicy for reservations is
so difficult. I will read more of the code and digest your explanation.
Thanks a lot for these details.

Thanks
Baoquan
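
For anyone following along, the reservation-time check discussed above can be pictured with a small userspace model. This is only a sketch and not the kernel code or the patch itself: the names model_allowed_free_pages() and model_reserve_check() are hypothetical, the per-node free counts are plain array entries, and the MPOL_BIND nodemask is reduced to a 64-bit bitmask. It also assumes the 4MB mapping in the reproducer needs two huge pages, i.e. a 2MB huge page size.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_NODES 64

/* Sum the free huge pages on the nodes selected by allowed_mask. */
unsigned long model_allowed_free_pages(const unsigned long *free_per_node,
                                       size_t nr_nodes,
                                       unsigned long long allowed_mask)
{
    unsigned long total = 0;

    for (size_t nid = 0; nid < nr_nodes && nid < MODEL_MAX_NODES; nid++) {
        if (allowed_mask & (1ULL << nid))
            total += free_per_node[nid];
    }
    return total;
}

/*
 * Model of a reservation-time check: refuse the reservation up front
 * (as mmap would with -ENOMEM) when the nodes allowed by the bind mask
 * cannot supply nr_pages, instead of letting a later fault hit SIGBUS.
 */
int model_reserve_check(const unsigned long *free_per_node, size_t nr_nodes,
                        unsigned long long allowed_mask,
                        unsigned long nr_pages)
{
    if (nr_pages > model_allowed_free_pages(free_per_node, nr_nodes,
                                            allowed_mask))
        return -ENOMEM;
    return 0;
}
```

With the reproducer's layout (two nodes, one free huge page each), a request for two pages bound to node 0 only (mask 0x1) is rejected in this model, while the same request with both nodes allowed (mask 0x3) passes. The model deliberately ignores everything that makes the real problem hard, such as the policy or the pool changing between reservation and fault.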