On 12/28/18 9:18 AM, Michal Hocko wrote: > On Thu 27-12-18 21:31:00, Andrew Morton wrote: >> On Thu, 27 Dec 2018 14:11:14 +0300 "Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> wrote: >> >>> On Mon, Dec 24, 2018 at 10:17:31AM +0100, Michal Hocko wrote: >>>> On Mon 24-12-18 01:05:57, David Rientjes wrote: >>>> [...] >>>>> I'm not interested in having a 100 email thread about this when a clear >>>>> and simple fix exists that actually doesn't break user code. >>>> >>>> You are breaking everybody who really wants to query MADV_NOHUGEPAGE >>>> status by this flag. Is there anybody doing that? >>> >>> Yes. >>> >>> https://github.com/checkpoint-restore/criu/blob/v3.11/criu/proc_parse.c#L143 >> >> Ugh. So the regression fix causes a regression? > > Yes. The patch from David will hardcode the nohugepage vm flag if the > THP was disabled by the prctl at the time of the snapshot. And if the > application later enables THP by the prctl then existing mappings would > never get the THP enabled status back. > > This is the kind of a potential regression I was poiting out earlier > when explaining that the patch encodes the logic into the flag exporting > and that means that whoever wants to get the raw value of the flag will > not be able to do so. Please note that the raw value is exactly what > this interface is documented and supposed to export. And as the > documentation explains it is implementation specific and anybody to use > it should be careful. Let's add some CRIU guys in the loop (dunno if the right ones). We're discussing David's patch [1] that makes 'nh' and 'hg' flags reported in /proc/pid/smaps (and set by madvise) overridable by prctl(PR_SET_THP_DISABLE). This was sort of accidental behavior (but only for mappings created after the prctl call) before 4.13 commit 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active"). For David's userspace that commit is a regression as there are false positives when checking for vma's that are eligible for THP (=don't have the 'nh' flag in smaps) but didn't really obtain THP's. The userspace assumes it's due to fragmentation (=bad) and cannot know that it's due to the prctl(). But we fear that making prctl() affect smaps vma flags means that CRIU can't query them accurately anymore, and thus it's a regression for CRIU. Can you comment on that? Michal has a patch [2] that reports the prctl() status separately, but that doesn't help David's existing userspace. For CRIU this also won't help as long the smaps vma flags still silently included the prctl() status. Do you see some solution that would work for everybody? [1] https://www.ozlabs.org/~akpm/mmots/broken-out/mm-thp-always-specify-disabled-vmas-as-nh-in-smaps.patch [2] https://www.ozlabs.org/~akpm/mmots/broken-out/mm-proc-report-pr_set_thp_disable-in-proc.patch