Re: [PATCH v10 3/3] mm: add anonymous vma name refcounting

On Tue, Oct 12, 2021 at 1:41 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Tue, Oct 12, 2021 at 11:52:42AM -0700, Suren Baghdasaryan wrote:
> > On Tue, Oct 12, 2021 at 11:26 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > >
> > > On Mon, Oct 11, 2021 at 10:36:24PM -0700, Suren Baghdasaryan wrote:
> > > > On Mon, Oct 11, 2021 at 8:00 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Oct 11, 2021 at 06:20:25PM -0700, Suren Baghdasaryan wrote:
> > > > > > On Mon, Oct 11, 2021 at 6:18 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Mon, Oct 11, 2021 at 1:36 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > On Fri 08-10-21 13:58:01, Kees Cook wrote:
> > > > > > > > > - Strings for "anon" specifically have no required format (this is good)
> > > > > > > > > >   it's informational like the task_struct::comm and can be (roughly)
> > > > > > > > >   anything. There's no naming convention for memfds, AF_UNIX, etc. Why
> > > > > > > > >   is one needed here? That seems like a completely unreasonable
> > > > > > > > >   requirement.
> > > > > > > >
> > > > > > > > I might be misreading the justification for the feature. Patch 2 is
> > > > > > > > > talking about tools that need to understand memory usage to take
> > > > > > > > > further actions. Also Suren was suggesting "numbering convention" as an
> > > > > > > > argument against.
> > > > > > > >
> > > > > > > > > So can we get a clear example of how this is actually being used? If this
> > > > > > > > > is just to be used for debugging by humans then I can see an argument for
> > > > > > > > > human readable form. If this is, however, meant to be used by tools to
> > > > > > > > > take some actions then the argument for strings is much weaker.
> > > > > > >
> > > > > > > The simplest usecase is when we notice that a process consumes more
> > > > > > > memory than usual and we do "cat /proc/$(pidof my_process)/maps" to
> > > > > > > check which area is contributing to this growth. The names we assign
> > > > > > > to anonymous areas are descriptive enough for a developer to get an
> > > > > > > idea where the increased consumption is coming from and how to proceed
> > > > > > > with their investigation.
> > > > > > > There are of course cases when tools are involved, but the end-user is
> > > > > > > always a human and the final report should contain easily
> > > > > > > understandable data.
> > > > > > >
> > > > > > > IIUC, the main argument here is whether the userspace can provide
> > > > > > > tools to perform the translations between ids and names, with the
> > > > > > > kernel accepting and reporting ids instead of strings. Technically
> > > > > > > it's possible, but to be practical that conversion should be fast
> > > > > > > because we will need to make name->id conversion potentially for each
> > > > > > > mmap. On the consumer side the performance is not as critical, but the
> > > > > > > fact that instead of dumping /proc/$pid/maps we will have to parse the
> > > > > > > file, do id->name conversion and replace all [anon:id] with
> > > > > > > [anon:name] would be an issue when we do that in bulk, for example
> > > > > > > when collecting system-wide data for a bugreport.
> > > > >
> > > > > Is that something you need to do client-side? Or could the bug tool
> > > > > upload the userspace-maintained name:ids database alongside the
> > > > > /proc/pid/maps dump for external processing?
> > > >
> > > > You can generate a bugreport and analyze it locally or submit it as an
> > > > attachment to a bug for further analysis.
> > > > Sure, we can attach the id->name conversion table to the bugreport but
> > > > either way, some tool would have to post-process it to resolve the
> > > > ids. If we are not analyzing the results immediately then that step
> > > > can be postponed and I think that's what you mean? If so, then yes,
> > > > that is correct.
> > >
> > > Right, somebody needs to do it at some point, but I suppose it's less
> > > of a problem if a developer machine does it than a mobile device.
> >
> > True, and that's why I mentioned that it's not as critical as the
> > efficiency at mmap() time. In any case, if we could avoid translations
> > at all that would be ideal.
> >
> > >
> > > One advantage of an ID over a string - besides not having to maintain
> > > a deduplicating arbitrary string storage in the kernel - is that we
> > > may be able to auto-assign unique IDs to VMAs in the kernel, in a way
> > > that we could not with strings. You'd still have to do IPC calls to
> > > write new name mappings into your db, but you wouldn't have to do the
> > > prctl() to assign stuff in the kernel at all.
> >
> > You still have to retrieve that tag from the kernel to record it in
> > your db, so this would still require some syscall, no?
>
> Don't you have to do this with the string setting interface as well?
> How do you know the vma address to pass into the prctl()? Is this
> somehow coordinated with the mmap()?

Sure. The sequence is:

ptr = mmap(NULL, size, ...);
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ptr, size, name);

>
> > > (We'd have to think of a solution of how IDs work with vma merging and
> > > splitting, but I think to a certain degree that's policy and we should
> > > be able to find something workable - a MAP_ID flag, using anon_vma as
> > > identity, assigning IDs at mmap time and do merges only for protection
> > > changes etc. etc.)
> >
> > Overall, I think keeping the kernel out of this and letting it treat
> > this tag as a cookie which only userspace cares about is simpler.
> > Unless you see other uses where kernel's involvement is needed.
>
> It depends on what you consider keeping the kernel out of it. A small
> extension to assign unique IDs to mappings automatically in an
> intuitive way (with a compat option to disable) is a much smaller ABI
> commitment than a prctl()-controlled string storage.

I'm not saying it's hard or complex. I just don't see the advantage of
generating these IDs in the kernel vs passing them from userspace.
Maybe I'm missing some usecase?

> When I say policy on how to assign the ID, I didn't mean that it
> should be a free for all. Rather that we should pick one reasonable
> way to do it, comparable to picking the parameters for how long the
> stored strings could be, which characters to allow etc.


