On Tue, 3 Mar 2015, Mike Kravetz wrote:

> hugetlbfs allocates huge pages from the global pool as needed.  Even if
> the global pool contains a sufficient number of pages for the filesystem
> size at mount time, those global pages could be grabbed for some other
> use.  As a result, filesystem huge page allocations may fail due to lack
> of pages.
>
> Applications such as a database want to use huge pages for performance
> reasons.  hugetlbfs filesystem semantics with ownership and modes work
> well to manage access to a pool of huge pages.  However, the application
> would like some reasonable assurance that allocations will not fail due
> to a lack of huge pages.  At application startup time, the application
> would like to configure itself to use a specific number of huge pages.
> Before starting, the application can check to make sure that enough
> huge pages exist in the system global pools.  What the application wants
> is exclusive use of a subpool of huge pages.
>
> Add a new hugetlbfs mount option 'reserved' to specify that the number
> of pages associated with the size of the filesystem will be reserved.  If
> there are insufficient pages, the mount will fail.  The reservation is
> maintained for the duration of the filesystem so that as pages are
> allocated and freed a sufficient number of pages remains reserved.
>

This functionality is somewhat limited because it's not possible to
reserve a subset of the size for a single mount point; it's either all
or nothing.  It shouldn't be too difficult to add a reserved=<value>
option where <value> is <= size.  Done that way, you could omit size=
entirely for unlimited hugepages while still ensuring that a low
watermark of hugepages is reserved for the database.
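
To make the suggested semantics concrete, here is a rough userspace model
of the accounting I have in mind.  It is only a sketch, not kernel code
and not what the patches implement: GlobalPool, Subpool, MountError and
mount() are hypothetical names invented for this illustration, and
size=None stands in for an omitted size= option.

```python
from dataclasses import dataclass
from typing import Optional


class MountError(Exception):
    """Hypothetical error for a mount request that cannot be satisfied."""


@dataclass
class GlobalPool:
    free_pages: int  # huge pages that are neither allocated nor reserved


@dataclass
class Subpool:
    pool: GlobalPool
    reserved: int        # low watermark, taken from the global pool at mount
    size: Optional[int]  # upper limit in pages; None models an omitted size=
    used: int = 0

    def alloc_page(self) -> bool:
        if self.size is not None and self.used >= self.size:
            return False  # filesystem size limit reached
        if self.used >= self.reserved:
            # Reservation is exhausted, so compete for a global page.
            if self.pool.free_pages == 0:
                return False
            self.pool.free_pages -= 1
        self.used += 1
        return True

    def free_page(self) -> None:
        if self.used == 0:
            raise ValueError("no pages in use")
        self.used -= 1
        if self.used >= self.reserved:
            # Surplus page goes back to the global pool.
            self.pool.free_pages += 1
        # Otherwise the page silently refills the reservation.


def mount(pool: GlobalPool, size: Optional[int] = None,
          reserved: int = 0) -> Subpool:
    """Model of mounting with an optional size and a reserved low watermark."""
    if reserved < 0 or (size is not None and size < 0):
        raise MountError("size and reserved must be non-negative")
    if size is not None and reserved > size:
        raise MountError("reserved must be <= size")
    if reserved > pool.free_pages:
        raise MountError("insufficient huge pages to reserve")
    pool.free_pages -= reserved
    return Subpool(pool=pool, reserved=reserved, size=size)
```

With reserved=2 and no size, the first two allocations in the model always
succeed regardless of what else drains the global pool, and anything
beyond that is best effort.  Freed pages refill the reservation before
any are returned to the global pool.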

> Comments from RFC addressed/incorporated
>
> Mike Kravetz (4):
>   hugetlbfs: add reserved mount fields to subpool structure
>   hugetlbfs: coordinate global and subpool reserve accounting
>   hugetlbfs: accept subpool reserved option and setup accordingly
>   hugetlbfs: document reserved mount option
>
>  Documentation/vm/hugetlbpage.txt | 18 ++++++++------
>  fs/hugetlbfs/inode.c             | 15 ++++++++++--
>  include/linux/hugetlb.h          |  7 ++++++
>  mm/hugetlb.c                     | 53 +++++++++++++++++++++++++++++++++-------
>  4 files changed, 75 insertions(+), 18 deletions(-)