Re: + mm-thp-always-specify-disabled-vmas-as-nh-in-smaps.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

The holidays are over and I think it's time to resurrect this thread.

On Fri, Dec 28, 2018 at 11:54:17AM +0100, Vlastimil Babka wrote:
> On 12/28/18 9:18 AM, Michal Hocko wrote:
> > On Thu 27-12-18 21:31:00, Andrew Morton wrote:
> >> On Thu, 27 Dec 2018 14:11:14 +0300 "Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> wrote:
> >>
> >>> On Mon, Dec 24, 2018 at 10:17:31AM +0100, Michal Hocko wrote:
> >>>> On Mon 24-12-18 01:05:57, David Rientjes wrote:
> >>>> [...]
> >>>>> I'm not interested in having a 100 email thread about this when a clear 
> >>>>> and simple fix exists that actually doesn't break user code.
> >>>>
> >>>> You are breaking everybody who really wants to query MADV_NOHUGEPAGE
> >>>> status by this flag. Is there anybody doing that?
> >>>
> >>> Yes.
> >>>
> >>> https://github.com/checkpoint-restore/criu/blob/v3.11/criu/proc_parse.c#L143
> >>
> >> Ugh.  So the regression fix causes a regression?
> > 
> > Yes. The patch from David will hardcode the nohugepage vm flag if the
> > THP was disabled by the prctl at the time of the snapshot. And if the
> > application later enables THP by the prctl then existing mappings would
> > never get the THP enabled status back.
> > 
> > This is the kind of a potential regression I was poiting out earlier
> > when explaining that the patch encodes the logic into the flag exporting
> > and that means that whoever wants to get the raw value of the flag will
> > not be able to do so. Please note that the raw value is exactly what
> > this interface is documented and supposed to export. And as the
> > documentation explains it is implementation specific and anybody to use
> > it should be careful.
> 
> Let's add some CRIU guys in the loop (dunno if the right ones). We're
> discussing David's patch [1] that makes 'nh' and 'hg' flags reported in
> /proc/pid/smaps (and set by madvise) overridable by
> prctl(PR_SET_THP_DISABLE). This was sort of accidental behavior (but
> only for mappings created after the prctl call) before 4.13 commit
> 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active").
> 
> For David's userspace that commit is a regression as there are false
> positives when checking for vma's that are eligible for THP (=don't have
> the 'nh' flag in smaps) but didn't really obtain THP's. The userspace
> assumes it's due to fragmentation (=bad) and cannot know that it's due
> to the prctl(). But we fear that making prctl() affect smaps vma flags
> means that CRIU can't query them accurately anymore, and thus it's a
> regression for CRIU. Can you comment on that?
> Michal has a patch [2] that reports the prctl() status separately, but
> that doesn't help David's existing userspace. For CRIU this also won't
> help as long the smaps vma flags still silently included the prctl()
> status. Do you see some solution that would work for everybody?

The patch from David obviously breaks CRIU, and I can't see a nice solution
that will work for everybody.

Of course we could add something like 'NH' to /proc/pid/smaps so that 'nh'
will work as David's userspace is expecting and 'NH' will represent the
state of VmFlags. This is hackish and ugly, though.

In any case, if David's patch is not reverted CRIU needs some way to know
if VMA has VM_NOHUGEPAGE set.

> [1]
> https://www.ozlabs.org/~akpm/mmots/broken-out/mm-thp-always-specify-disabled-vmas-as-nh-in-smaps.patch
> [2]
> https://www.ozlabs.org/~akpm/mmots/broken-out/mm-proc-report-pr_set_thp_disable-in-proc.patch
> 

-- 
Sincerely yours,
Mike.




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux