On Tue 08-09-20 10:05:11, Zi Yan wrote:
> On 8 Sep 2020, at 7:57, David Hildenbrand wrote:
>
> > On 03.09.20 18:30, Roman Gushchin wrote:
> >> On Thu, Sep 03, 2020 at 05:23:00PM +0300, Kirill A. Shutemov wrote:
> >>> On Wed, Sep 02, 2020 at 02:06:12PM -0400, Zi Yan wrote:
> >>>> From: Zi Yan <ziy@xxxxxxxxxx>
> >>>>
> >>>> Hi all,
> >>>>
> >>>> This patchset adds support for 1GB THP on x86_64. It is on top of
> >>>> v5.9-rc2-mmots-2020-08-25-21-13.
> >>>>
> >>>> 1GB THP is more flexible for reducing translation overhead and increasing the
> >>>> performance of applications with large memory footprint without application
> >>>> changes compared to hugetlb.
> >>>
> >>> This statement needs a lot of justification. I don't see 1GB THP as viable
> >>> for any workload. Opportunistic 1GB allocation is a very questionable
> >>> strategy.
> >>
> >> Hello, Kirill!
> >>
> >> I share your skepticism about opportunistic 1 GB allocations, however it might be useful
> >> if backed by madvise() annotations from the userspace application. In this case,
> >> 1 GB THPs might be an alternative to 1 GB hugetlbfs pages, but with a more convenient
> >> interface.
> >
> > I have concerns if we would silently use 1 GB THPs in most scenarios
> > where we would have used 2 MB THP. I'd appreciate a trigger to
> > explicitly enable that - MADV_HUGEPAGE is not sufficient because some
> > applications relying on that assume that the THP size will be 2 MB
> > (especially, if you want sparse, large VMAs).
>
> This patchset is not intended to silently use 1GB THP in place of 2MB THP.
> First of all, there is a knob /sys/kernel/mm/transparent_hugepage/enable_1GB
> to enable 1GB THP explicitly. Also, 1GB THP is allocated from a reserved CMA
> region (although I had alloc_contig_pages as a fallback, which can be removed
> in next version), so users need to add hugepage_cma=nG kernel parameter to
> enable 1GB THP allocation. If finer control is necessary, we can add
> a new MADV_HUGEPAGE_1GB for 1GB THP.

A global knob is insufficient. 1G pages will become a very precious
resource as they require a pre-allocation (reservation). So it really has
to be an opt-in and the question is whether there is also some sort of
access control needed.
-- 
Michal Hocko
SUSE Labs
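
For illustration, the size assumption David mentions can be made concrete with a little arithmetic: a sparse VMA that holds hundreds of aligned 2 MB chunks may hold no aligned 1 GB chunk at all. The following is a minimal Python sketch, not code from the patchset. The helper names aligned_span and opt_in_thp are hypothetical. Only the existing MADV_HUGEPAGE advice is used, since the MADV_HUGEPAGE_1GB flag discussed above is a proposal and is not assumed to exist.

```python
import mmap

MB2 = 2 << 20   # 2 MB, the THP size applications assume today on x86_64
GB1 = 1 << 30   # 1 GB, the THP size proposed in the patchset


def aligned_span(start, length, page_size):
    """Return (first, count) for the range [start, start + length).

    first is the lowest page_size-aligned address >= start, and count is
    the number of whole, aligned page_size chunks that fit in the range.
    """
    first = (start + page_size - 1) // page_size * page_size
    end = start + length
    if first >= end:
        return first, 0
    return first, (end - first) // page_size


def opt_in_thp(length):
    """Map private anonymous memory and ask for THP on it.

    Returns (mapping, advised). advised is False when MADV_HUGEPAGE is
    not available on this platform or the kernel rejects the advice.
    """
    m = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    flag = getattr(mmap, "MADV_HUGEPAGE", None)  # Linux-only constant
    advised = False
    if flag is not None:
        try:
            m.madvise(flag)  # per-mapping opt-in; the size is the kernel's choice
            advised = True
        except OSError:
            pass  # e.g. THP not compiled in or disabled
    return m, advised


if __name__ == "__main__":
    # A 1 GB range that starts one 4 KB page past a 1 GB boundary.
    start, length = 4096, GB1
    print("2 MB chunks:", aligned_span(start, length, MB2)[1])
    print("1 GB chunks:", aligned_span(start, length, GB1)[1])
    mapping, advised = opt_in_thp(4 * MB2)
    print("MADV_HUGEPAGE accepted:", advised)
    mapping.close()
```

The madvise() call here only marks the mapping as a THP candidate. It says nothing about which huge page size is used, which is why a separate, explicit opt-in for 1 GB pages is being asked for in this thread.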