On Thu, Nov 12, 2020 at 09:15:18PM +0100, David Hildenbrand wrote: > > > Am 12.11.2020 um 20:08 schrieb Mike Rapoport <rppt@xxxxxxxxxx>: > > > > On Thu, Nov 12, 2020 at 05:22:00PM +0100, David Hildenbrand wrote: > >>> On 10.11.20 19:06, Mike Rapoport wrote: > >>> On Tue, Nov 10, 2020 at 06:17:26PM +0100, David Hildenbrand wrote: > >>>> On 10.11.20 16:14, Mike Rapoport wrote: > >>>>> From: Mike Rapoport <rppt@xxxxxxxxxxxxx> > >>>>> > >>>>> It will be used by the upcoming secret memory implementation. > >>>>> > >>>>> Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx> > >>>>> --- > >>>>> mm/internal.h | 3 +++ > >>>>> mm/mmap.c | 5 ++--- > >>>>> 2 files changed, 5 insertions(+), 3 deletions(-) > >>>>> > >>>>> diff --git a/mm/internal.h b/mm/internal.h > >>>>> index c43ccdddb0f6..ae146a260b14 100644 > >>>>> --- a/mm/internal.h > >>>>> +++ b/mm/internal.h > >>>>> @@ -348,6 +348,9 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma) > >>>>> extern void mlock_vma_page(struct page *page); > >>>>> extern unsigned int munlock_vma_page(struct page *page); > >>>>> +extern int mlock_future_check(struct mm_struct *mm, unsigned long flags, > >>>>> + unsigned long len); > >>>>> + > >>>>> /* > >>>>> * Clear the page's PageMlocked(). This can be useful in a situation where > >>>>> * we want to unconditionally remove a page from the pagecache -- e.g., > >>>>> diff --git a/mm/mmap.c b/mm/mmap.c > >>>>> index 61f72b09d990..c481f088bd50 100644 > >>>>> --- a/mm/mmap.c > >>>>> +++ b/mm/mmap.c > >>>>> @@ -1348,9 +1348,8 @@ static inline unsigned long round_hint_to_min(unsigned long hint) > >>>>> return hint; > >>>>> } > >>>>> -static inline int mlock_future_check(struct mm_struct *mm, > >>>>> - unsigned long flags, > >>>>> - unsigned long len) > >>>>> +int mlock_future_check(struct mm_struct *mm, unsigned long flags, > >>>>> + unsigned long len) > >>>>> { > >>>>> unsigned long locked, lock_limit; > >>>>> > >>>> > >>>> So, an interesting question is if you actually want to charge secretmem > >>>> pages against mlock now, or if you want a dedicated secretmem cgroup > >>>> controller instead? > >>> > >>> Well, with the current implementation there are three limits an > >>> administrator can use to control secretmem limits: mlock, memcg and > >>> kernel parameter. > >>> > >>> The kernel parameter puts a global upper limit for secretmem usage, > >>> memcg accounts all secretmem allocations, including the unused memory in > >>> large pages caching and mlock allows per task limit for secretmem > >>> mappings, well, like mlock does. > >>> > >>> I didn't consider a dedicated cgroup, as it seems we already have enough > >>> existing knobs and a new one would be unnecessary. > >> > >> To me it feels like the mlock() limit is a wrong fit for secretmem. But > >> maybe there are other cases of using the mlock() limit without actually > >> doing mlock() that I am not aware of (most probably :) )? > > > > Secretmem does not explicitly calls to mlock() but it does what mlock() > > does and a bit more. Citing mlock(2): > > > > mlock(), mlock2(), and mlockall() lock part or all of the calling > > process's virtual address space into RAM, preventing that memory from > > being paged to the swap area. > > > > So, based on that secretmem pages are not swappable, I think that > > RLIMIT_MEMLOCK is appropriate here. > > > > The page explicitly lists mlock() system calls. Well, it's mlock() man page, isn't it? ;-) My thinking was that since secretmem does what mlock() does wrt swapability, it should at least obey the same limit, i.e. RLIMIT_MEMLOCK. > E.g., we also don‘t > account for gigantic pages - which might be allocated from CMA and are > not swappable. Do you mean gigantic pages in hugetlbfs? It seems to me that hugetlbfs accounting is a completely different story. > >> I mean, my concern is not earth shattering, this can be reworked later. As I > >> said, it just feels wrong. > >> > >> -- > >> Thanks, > >> > >> David / dhildenb > >> > > > > -- > > Sincerely yours, > > Mike. > > > -- Sincerely yours, Mike.