On Wed, 05 Jan 2011, CAI Qian wrote:

> ----- Original Message -----
> > On Tue, 04 Jan 2011, CAI Qian wrote:
> >
> > > 1GB pages cannot be over-committed, attempting to do so results in
> > > corruption, so remove those files for simplicity.
> > >
> > > Symptoms:
> > > 1) setup 1gb hugepages.
> > >
> > > cat /proc/cmdline
> > > ...default_hugepagesz=1g hugepagesz=1g hugepages=1...
> > >
> > > cat /proc/meminfo
> > > ...
> > > HugePages_Total: 1
> > > HugePages_Free: 1
> > > HugePages_Rsvd: 0
> > > HugePages_Surp: 0
> > > Hugepagesize: 1048576 kB
> > > ...
> > >
> > > 2) set nr_overcommit_hugepages
> > >
> > > echo 1 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages
> > > cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages
> > > 1
> > >
> > > 3) overcommit 2gb hugepages.
> > >
> > > mmap(NULL, 18446744071562067968, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = -1 ENOMEM (Cannot allocate memory)
> > >
> > > cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages
> > > 18446744071589420672
> > >
> > > Signed-off-by: CAI Qian <caiqian@xxxxxxxxxx>
> >
> > There are a couple of issues here: first, I think the overcommit value
> > being overwritten is a bug and this needs to be addressed and fixed
> > before we cover it by removing the sysfs file.
> >
> > Second, will it be easier for userspace to work with some huge page
> > sizes having the overcommit file and others not or making the kernel
> > hand EINVAL back when nr_overcommit is changed for an unsupported
> > page size?
> >
> > Finally, this is a problem for more than 1GB pages on x86_64. It is
> > true for all pages > 1 << MAX_ORDER. Once the overcommit bug is fixed
> > and the second issue is answered, the solution that is used (either
> > EINVAL or no overcommit file) needs to happen for all cases where it
> > applies, not just the 1GB case.
>
> I have a new patch ready to return EINVAL for both sysfs/procfs, and will
> reject changing of nr_hugepages. Do you know if nr_hugepages_mempolicy
> is supposed to be able to change in this case? It is not possible currently.
>
> # cat /proc/sys/vm/nr_hugepages_mempolicy
> 1
> # echo 0 >/proc/sys/vm/nr_hugepages_mempolicy
> # cat /proc/sys/vm/nr_hugepages_mempolicy
> 1

nr_hugepages_mempolicy should follow all the same rules as nr_hugepages
with respect to MAX_ORDER. The difference is that nr_hugepages_mempolicy
respects the NUMA allocation policy that is set.

I have a pair of patches that do about the same thing, but instead of
altering flush_write_buffer, the first makes the functions in hugetlb.c
that use strict_strtoul return -EINVAL on error instead of 0. The second
patch is the same as your check for MAX_ORDER. I think returning -EINVAL
from hugetlb.c makes better sense than changing the behavior of
flush_write_buffer.

Patches will be on the way as soon as I am sure they build.
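For illustration only, here is a rough userspace sketch of the behavior
described above. It is not the actual patches. The names
sketch_strict_strtoul, sketch_store_overcommit and SKETCH_MAX_ORDER are
hypothetical, the value 11 is an assumed stand-in for MAX_ORDER, and the
exact order comparison is an assumption too. The point is only that a
bad parse and an unsupported page size both hand -EINVAL back to the
writer and leave the stored value untouched.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Assumed stand-in for the kernel's MAX_ORDER; the real value depends
 * on architecture and configuration. */
#define SKETCH_MAX_ORDER 11

/* Hypothetical userspace model of a strict parse: the whole string must
 * be a decimal number, with at most one trailing newline. Returns 0 on
 * success and -EINVAL on any malformed input. */
static int sketch_strict_strtoul(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long val;

	/* strtoul() would accept leading spaces and a sign; reject them. */
	if (buf == NULL || !isdigit((unsigned char)buf[0]))
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno == ERANGE)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = val;
	return 0;
}

/* Hypothetical store handler: propagate the parse error instead of
 * returning 0, and refuse page sizes the buddy allocator cannot supply.
 * *nr_overcommit is only written when everything succeeded. */
static int sketch_store_overcommit(unsigned int page_order, const char *buf,
				   unsigned long *nr_overcommit)
{
	unsigned long val;
	int err;

	if (page_order >= SKETCH_MAX_ORDER)
		return -EINVAL;

	err = sketch_strict_strtoul(buf, &val);
	if (err)
		return err;

	*nr_overcommit = val;
	return 0;
}
```

With 4KB base pages a 2MB huge page is order 9 and a 1GB huge page is
order 18, so in this sketch sketch_store_overcommit(9, "1\n", &v)
succeeds while sketch_store_overcommit(18, "1\n", &v) returns -EINVAL
without modifying v.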