Re: [mm-unstable v7 13/18] proc/smaps: add PMDMappable field to smaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jul 12 10:27, Yang Shi wrote:
> On Tue, Jul 12, 2022 at 9:31 AM Zach O'Keefe <zokeefe@xxxxxxxxxx> wrote:
> >
> > On Jul 11 14:37, Yang Shi wrote:
> > > On Wed, Jul 6, 2022 at 5:06 PM Zach O'Keefe <zokeefe@xxxxxxxxxx> wrote:
> > > >
> > > > Add PMDMappable field to smaps output which informs the user if memory
> > > > in the VMA can be PMD-mapped by MADV_COLLAPSE.
> > > >
> > > > The distinction from THPeligible is needed for two reasons:
> > > >
> > > > 1) For THP, MADV_COLLAPSE is not coupled to THP sysfs controls, which
> > > >    THPeligible reports.
> > > >
> > > > 2) PMDMappable can also be used in HugeTLB fine-granularity mappings,
> > > >    which are independent from THP.
> > >
> > > Could you please elaborate the usecase? The user checks this hint
> > > before calling MADV_COLLAPSE? Is it really necessary?
> > >
> > > And, TBH it sounds confusing and we don't have to maintain both
> > > THPeligible and PMDMappable. We could just relax THPeligible to make
> > > it return 1 even though THP is disabled by sysfs but MADV_COLLAPSE
> > > could collapse it if such hint is useful.
> > >
> >
> > Hey Yang,
> >
> > Thanks for taking the time to review this series again, and thanks for
> > challenging this.
> >
> > TLDR: "Is it really necessary" - at the moment, no, probably not .. but I think
> > it's "useful".
> >
> > Rationale:
> >
> > 1. IMO, I thought was was confusing seeing:
> >
> >         ...
> >         AnonHugePages:      2048 kB
> >         ShmemPmdMapped:        0 kB
> >         FilePmdMapped:         0 kB
> >         Shared_Hugetlb:        0 kB
> >         Private_Hugetlb:       0 kB
> >         Swap:                  0 kB
> >         SwapPss:               0 kB
> >         Locked:                0 kB
> >         THPeligible:    0
> >         ...
> >
> > Maybe this could simply be clarified in the docs though.  I guess we can already
> > get:
> >
> >         ...
> >         AnonHugePages:         0 kB
> >         ShmemPmdMapped:        0 kB
> >         FilePmdMapped:      2048 kB
> >         Shared_Hugetlb:        0 kB
> >         Private_Hugetlb:       0 kB
> >         Swap:                  0 kB
> >         SwapPss:               0 kB
> >         Locked:                0 kB
> >         THPeligible:    0
> >         ...
> >
> > today[1], so perhaps it's not a big deal.
> 
> Not only that, if you have file PMD mapped then turn the THP sysfs
> flag off, you get the same result. It is just a hint and just shows
> the status at that moment when reading smaps.
>

Very good point.

> >
> >
> > 2. It was useful for debugging - similar to rationale for including
> > THPeligible1[2], the logic for determining if a VMA is eligible is pretty
> > complicated. I.e. is this file mapped suitably? Unlike THPeligible, however,
> > madvise(2) has the ability to set errno on failure to help* diagnose why some
> > memory isn't being backed.
> 
> I don't disagree it would help for debugging. But as a user who
> doesn't know too much about kernel internals, when I see THPeligible
> and PMDmappable, I would get confused TBH. And do we have to maintain
> another similar hint? Maybe not.
>

> >
> > 3. For the immediately-envisioned usecases, the user "knows" about what memory
> > they are acting on. However, eventually we'd like to experiment with moving THP
> > utilization policy to userspace. Here, it would be useful if the userspace agent
> > managing was made aware of what memory it should be managing. I don't have a
> > working prototype of what this would like yet, however.
> 
> It is not a strong justification to add some user visible stuff for a
> future feature (not even prototyped) since things may change, it
> sounds safer to add such things once the usecase is solid TBH.
>

Ya, this was a weaker point for inclusion *now* TBH.

> >
> > 4. I thought it was neat that this field could be reused for HugeTLB
> > fine-granularity mappings - but TBH I'm not sure how useful it'd be there.
> >
> > I figured relaxing existing THPeligible could break existing users / tests, and
> > it'd be likewise confusing for them to see THPeligible: 1, but then have faults
> > fail and they'd then have to go check sysfs settings and vma flags ; we'd be
> > back in pre-commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
> > vma").
> 
> I'm not sure what applications rely on this hint, but if they are just
> some test scripts, I think it should be fine. I don't think we
> guarantee the test scripts won't get broken. AFAIK some test scripts
> rely on the kernel dmesg text, for example, OOMs. And the meaning of
> the fields do change, for example, inactive anon of /proc/meminfo,
> which was changed by the patchset which put anon pages on inactive
> list first instead of active list. We already noticed the abnormal
> value from our monitoring tool when we adopted 5.10+ kernel. And
> /proc/vmstat also had some fields renamed, for example,
> workingset_refault of /proc/vmstat, it was split to
> workseting_refault_anon and workingset_refault_file, so we had to
> update our monitoring scripts accordingly. I think /proc/meminfo and
> /proc/vmstat are more heavily used than smaps.
>

Thanks for the great context. My guess is, right now, THPelligible is more
useful as-is than if we were to relax it to MADV_COLLAPSE eligibility. As such,
I'm fine dropping this until a stronger and more immediate usecase presents
itself. Thanks for checking my rationale here.

Best,
Zach


> >
> > Thanks,
> > Zach
> >
> > [1] https://lore.kernel.org/linux-mm/YrxbQGiwml24APCx@xxxxxxxxxx/
> >
> >
> > >
> > > >
> > > > Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> > > > ---
> > > >  Documentation/filesystems/proc.rst | 10 ++++++++--
> > > >  fs/proc/task_mmu.c                 |  2 ++
> > > >  2 files changed, 10 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > > > index 47e95dbc820d..f207903a57a5 100644
> > > > --- a/Documentation/filesystems/proc.rst
> > > > +++ b/Documentation/filesystems/proc.rst
> > > > @@ -466,6 +466,7 @@ Memory Area, or VMA) there is a series of lines such as the following::
> > > >      MMUPageSize:           4 kB
> > > >      Locked:                0 kB
> > > >      THPeligible:           0
> > > > +    PMDMappable:           0
> > > >      VmFlags: rd ex mr mw me dw
> > > >
> > > >  The first of these lines shows the same information as is displayed for the
> > > > @@ -518,9 +519,14 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
> > > >  does not take into account swapped out page of underlying shmem objects.
> > > >  "Locked" indicates whether the mapping is locked in memory or not.
> > > >
> > > > +"PMDMappable" indicates if the memory can be mapped by PMDs - 1 if true, 0
> > > > +otherwise.  It just shows the current status. Note that this is memory
> > > > +operable on explicitly by MADV_COLLAPSE.
> > > > +
> > > >  "THPeligible" indicates whether the mapping is eligible for allocating THP
> > > > -pages as well as the THP is PMD mappable or not - 1 if true, 0 otherwise.
> > > > -It just shows the current status.
> > > > +pages by the kernel, as well as the THP is PMD mappable or not - 1 if true, 0
> > > > +otherwise. It just shows the current status.  Note this is memory the kernel can
> > > > +transparently provide as THPs.
> > > >
> > > >  "VmFlags" field deserves a separate description. This member represents the
> > > >  kernel flags associated with the particular virtual memory area in two letter
> > > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > > > index f8cd58846a28..29f2089456ba 100644
> > > > --- a/fs/proc/task_mmu.c
> > > > +++ b/fs/proc/task_mmu.c
> > > > @@ -867,6 +867,8 @@ static int show_smap(struct seq_file *m, void *v)
> > > >
> > > >         seq_printf(m, "THPeligible:    %d\n",
> > > >                    hugepage_vma_check(vma, vma->vm_flags, true, false, true));
> > > > +       seq_printf(m, "PMDMappable:    %d\n",
> > > > +                  hugepage_vma_check(vma, vma->vm_flags, true, false, false));
> > > >
> > > >         if (arch_pkeys_enabled())
> > > >                 seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> > > > --
> > > > 2.37.0.rc0.161.g10f37bed90-goog
> > > >




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux