On Wed, 18 Apr 2018 05:08:13 +0800 Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> wrote:

> Since tmpfs THP was supported in 4.8, hugetlbfs is not the only
> filesystem with huge page support anymore. tmpfs can use huge pages via
> THP when mounted with the "huge=" mount option.
>
> When applications use huge pages on hugetlbfs, they just need to check
> the filesystem magic number, but that is not enough for tmpfs. So,
> introduce an ST_HUGE flag to statfs if the super block has SB_HUGE set,
> which indicates huge pages are supported on the specific filesystem.
>
> Some applications could benefit from this change, for example QEMU.
> When using an mmap'd file as guest VM backend memory, QEMU typically
> mmaps the file size plus one extra page. If the file is on hugetlbfs the
> extra page is huge page size (i.e. 2MB), but it is still 4KB on tmpfs
> even though THP is enabled. tmpfs THP requires the VMA to be huge page
> aligned, so if a 4KB page is used THP will not be used at all. The
> /proc/meminfo fragment below shows the THP use of QEMU with 4K pages:
>
> ShmemHugePages: 679936 kB
> ShmemPmdMapped: 0 kB
>
> With the ST_HUGE flag, QEMU can get huge pages, and /proc/meminfo then
> looks like:
>
> ShmemHugePages: 77824 kB
> ShmemPmdMapped: 6144 kB
>
> With this flag, applications can know whether huge pages are supported
> on the filesystem and optimize their behavior accordingly. Although a
> similar function can be implemented in applications by traversing the
> mount options, it looks more convenient if the kernel can provide such
> a flag.
>
> Even though ST_HUGE is set, f_bsize still returns 4KB for tmpfs since
> THP could be split, and it may also fall back to 4KB pages silently if
> there are not enough huge pages.
>
> Also set the flag for hugetlbfs to keep things consistent, so that
> applications don't have to know which filesystem is in use in order to
> use huge pages; they just need to check the ST_HUGE flag.
>

Patch is simple enough, although I'm having trouble forming an opinion
about it ;)

It will call for an update to the statfs(2) manpage.  I'm not sure which
of linux-man@xxxxxxxxxxxxxxx, mtk.manpages@xxxxxxxxx and
linux-api@xxxxxxxxxxxxxxx is best for that, so I'd cc all three...
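
For illustration, the userspace check the changelog describes might look
roughly like the sketch below. EXAMPLE_ST_HUGE and both helper names are
hypothetical: the thread does not give a value for the proposed flag, so
the one used here is a placeholder. Only struct statfs and
HUGETLBFS_MAGIC come from existing headers.

```c
#include <stddef.h>
#include <sys/vfs.h>      /* struct statfs, statfs(2) */
#include <linux/magic.h>  /* HUGETLBFS_MAGIC, TMPFS_MAGIC */

/*
 * Placeholder for the flag proposed in this thread. The value is an
 * assumption for illustration only, not a constant from kernel headers.
 */
#define EXAMPLE_ST_HUGE 0x4000UL

/*
 * Hypothetical helper: nonzero if the filesystem described by *st
 * advertises huge page support, either through the hugetlbfs magic
 * number (the existing check) or through the proposed flag.
 */
static int fs_has_huge_pages(const struct statfs *st)
{
	if ((unsigned int)st->f_type == HUGETLBFS_MAGIC)
		return 1;
	return ((unsigned long)st->f_flags & EXAMPLE_ST_HUGE) != 0;
}

/*
 * Hypothetical helper: mapping length for a file-backed guest, i.e.
 * the file size plus one extra page of the given size, as the
 * changelog describes for QEMU.
 */
static size_t backing_map_len(size_t file_size, size_t page_size)
{
	return file_size + page_size;
}
```

A caller would fill in the struct with statfs(path, &st) and, when
fs_has_huge_pages() returns nonzero, pass the huge page size (2MB in the
changelog's example) to backing_map_len() instead of 4KB. The page size
has to come from somewhere other than f_bsize, which stays at 4KB on
tmpfs according to the changelog.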