Re: + mm-thp-always-specify-disabled-vmas-as-nh-in-smaps.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 29 Dec 2018, Kirill A. Shutemov wrote:

> > > > Smaps reveals the state of each vma at the moment it is read, not the 
> > > > future, this has always been the case before and after the broken commit; 
> > > > it's not intended to be something that can specify "thp is currently 
> > > > disabled, but if we did prctl(PR_SET_THP_DISABLE, 0), this vma has done 
> > > > madvise(MADV_HUGEPAGE), and we are on a thp enabled == madvise host, so 
> > > > it would be eligible for thp."
> > > 
> > > The point of CRIU is that it tries to save and re-create the state of the
> > > process.
> > > 
> > > With your fix, "nh" in VMA flags can mean two different things:
> > >  - MADV_NOHUGEPAGE was called for the VMA, or
> > >  - prctl(PR_SET_THP_DISABLE) was called for the whole process
> > > 
> > > Current CRIU code assumes that "nh" means MADV_NOHUGEPAGE. But if it was
> > > due to prctl(PR_SET_THP_DISABLE) the process state will not be restored
> > > correctly: all VMAs will be restored with MADV_NOHUGEPAGE and future
> > > re-enabling THP with prctl(PR_SET_THP_DISABLE) for restored process will
> > > not have any effect. VMAs stay not eligble for THP.
> > > 
> > 
> > CRIU's usage of "nh" was created specifically for PR_SET_THP_DISABLE:
> > 
> > commit 9f61de87768b33d9b187ef71a57262a17276ff3b
> > 
> > proc_parse: make smaps parsing helpers globally available
> > 
> > The is_vma_range_fmt and parse_vmflags will be required for detection of
> > availability of PR_SET_THP_DISABLE prctl
> > 
> > Acked-by: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
> > Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Andrei Vagin <avagin@xxxxxxxxxxxxx>
> > 
> > 
> > That's at 
> > https://github.com/checkpoint-restore/criu/commit/9f61de87768b33d9b187ef71a57262a17276ff3b.
> > 
> > They are using it for the *exact* same reason that we are.  That commit 
> > was merged June 2017 based on the kernel code as it existed prior to the 
> > broken commit in July 2017.
> > 
> > 
> > Does this not suggest that CRIU is now equally as broken as my userspace?
> 
> I think it *was* broken in the same way as your userspace. And it was
> adjusted to the updated kernel by 5480a2a40683.
> 

Ah, ok, I see this in criu:

commit 5480a2a4068349c911fd78b9a45850b4b8a5596e
Author: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
Date:   Wed Jun 28 09:43:25 2017 +0300

    kerndat: add test for availability of PR_SET_THP_DISABLE prctl
    
    The PR_SET_THP_DISABLE prctl allows control of transparent huge pages on
    per-process basis. It is available since Linux 3.15, but until recently it
    set VM_NOHUGEPAGE for all VMAs created after prctl() call, which prevents
    proper restore for combination of pre- and post-copy. A recent change to
    prctl(PR_SET_THP_DISABLE) behavior eliminates the use of per-VMA flags and
    we can use the new version of the prctl() to disable THP.
    
    Acked-by: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
    Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
    Signed-off-by: Andrei Vagin <avagin@xxxxxxxxxxxxx>

So commit 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active") 
broke our usecase as well as criu.  I'm not sure of the implications of 
the fix above, it appears that it tests if PR_SET_THP_DISABLE is available 
and whether or not we can parse VmFlags from smaps to detect that.

I assume that criu still works on previous kernels such that if this patch 
were merged that it wouldn't be broken.

So we have a case where multiple users have been broken by the commit, is 
there an objection to merging this so that we always get "nh" in VmFlags 
when PR_SET_THP_DISABLE is used since this was the only way it was 
detectable on earlier kernels?



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux