On Fri, Jul 30, 2021 at 12:48:33AM -0700, Hugh Dickins wrote:
> Add support for fcntl(fd, F_HUGEPAGE) and fcntl(fd, F_NOHUGEPAGE), to
> select hugeness per file: useful to override the default hugeness of the
> shmem mount, when occasionally needing to store a hugepage file in a
> smallpage mount or vice versa.

Hm. But why is the new MFD_* needed if the fcntl() can do the same?

> These fcntls just specify whether or not to try for huge pages when
> allocating to the object later: F_HUGEPAGE does not touch small pages
> already allocated (though khugepaged may do so when the file is mapped
> afterwards), F_NOHUGEPAGE does not split huge pages already allocated.
>
> Why fcntl? Because it's already in use (for sealing) on memfds; and I'm
> anxious to keep this simple, just applying it to whole files: fallocate,
> madvise and posix_fadvise each involve a range, which would need a new
> kind of tree attached to the inode for proper support.

Most fadvise() operations ignore the range. I like fadvise() because
it's less prescriptive: the kernel is free to ignore it.

-- 
 Kirill A. Shutemov
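
For illustration only, a minimal sketch of what a whole-file fadvise() hint
looks like from userspace with the existing interface. It is not part of the
patch under discussion and does not use F_HUGEPAGE or any hugepage advice
value; the helper name advise_whole_file is hypothetical. It assumes Linux,
where an offset of 0 with a length of 0 covers the file from start to end, and
where POSIX_FADV_SEQUENTIAL and POSIX_FADV_RANDOM adjust readahead for the
whole open file regardless of the range passed.

```python
import os
import tempfile


def advise_whole_file(fd: int, advice: int) -> None:
    """Apply an advisory hint to the entire file (hypothetical helper).

    offset=0 with length=0 means "from the start through the end of the
    file". The hint is advisory: the kernel may honour or ignore it.
    """
    os.posix_fadvise(fd, 0, 0, advice)


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * 4096)
        f.flush()
        # Both hints change readahead behaviour for this open file as a
        # whole, so the range arguments carry no extra information here.
        advise_whole_file(f.fileno(), os.POSIX_FADV_SEQUENTIAL)
        advise_whole_file(f.fileno(), os.POSIX_FADV_RANDOM)
```

A call that succeeds here says nothing about whether the hint had any effect,
which is the "less prescriptive" property argued for above.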