Re: [PATCH 3/3] mm/maps: read proc/pid/maps under RCU

On Mon, Jan 22, 2024 at 10:07 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Mon, Jan 22, 2024 at 9:36 PM SeongJae Park <sj@xxxxxxxxxx> wrote:
> >
> > Hi Suren,
> >
> > On Sun, 21 Jan 2024 23:13:24 -0800 Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> >
> > > With maple_tree supporting vma tree traversal under RCU and per-vma locks
> > > making vma access RCU-safe, /proc/pid/maps can be read under RCU and
> > > without the need to read-lock mmap_lock. However vma content can change
> > > from under us, therefore we make a copy of the vma and we pin pointer
> > > fields used when generating the output (currently only vm_file and
> > > anon_name). Afterwards we check for concurrent address space
> > > modifications, wait for them to end and retry. That last check is needed
> > > to avoid possibility of missing a vma during concurrent maple_tree
> > > node replacement, which might report a NULL when a vma is replaced
> > > with another one. While we take the mmap_lock for reading during such
> > > contention, we do that momentarily only to record new mm_wr_seq counter.
> > > This change is designed to reduce mmap_lock contention and prevent a
> > > process reading /proc/pid/maps files (often a low priority task, such as
> > > monitoring/data collection services) from blocking address space updates.
> > >
> > > Note that this change has a userspace visible disadvantage: it allows for
> > > sub-page data tearing as opposed to the previous mechanism where data
> > > tearing could happen only between pages of generated output data.
> > > Since current userspace considers data tearing between pages to be
> > > acceptable, we assume it will be able to handle sub-page data tearing
> > > as well.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> > > ---
> > >  fs/proc/internal.h |   2 +
> > >  fs/proc/task_mmu.c | 114 ++++++++++++++++++++++++++++++++++++++++++---
> > >  2 files changed, 109 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > > index a71ac5379584..e0247225bb68 100644
> > > --- a/fs/proc/internal.h
> > > +++ b/fs/proc/internal.h
> > > @@ -290,6 +290,8 @@ struct proc_maps_private {
> > >       struct task_struct *task;
> > >       struct mm_struct *mm;
> > >       struct vma_iterator iter;
> > > +     unsigned long mm_wr_seq;
> > > +     struct vm_area_struct vma_copy;
> > >  #ifdef CONFIG_NUMA
> > >       struct mempolicy *task_mempolicy;
> > >  #endif
> > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > > index 3f78ebbb795f..3886d04afc01 100644
> > > --- a/fs/proc/task_mmu.c
> > > +++ b/fs/proc/task_mmu.c
> > > @@ -126,11 +126,96 @@ static void release_task_mempolicy(struct proc_maps_private *priv)
> > >  }
> > >  #endif
> > >
> > > -static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
> > > -                                             loff_t *ppos)
> > > +#ifdef CONFIG_PER_VMA_LOCK
> > > +
> > > +static const struct seq_operations proc_pid_maps_op;
> > > +/*
> > > + * Take VMA snapshot and pin vm_file and anon_name as they are used by
> > > + * show_map_vma.
> > > + */
> > > +static int get_vma_snapshow(struct proc_maps_private *priv, struct vm_area_struct *vma)
> > >  {
> > > +     struct vm_area_struct *copy = &priv->vma_copy;
> > > +     int ret = -EAGAIN;
> > > +
> > > +     memcpy(copy, vma, sizeof(*vma));
> > > +     if (copy->vm_file && !get_file_rcu(&copy->vm_file))
> > > +             goto out;
> > > +
> > > +     if (copy->anon_name && !anon_vma_name_get_rcu(copy))
> > > +             goto put_file;
> >
> > With today's updated mm-unstable, which contains this patch, I'm getting the
> > build errors below when CONFIG_ANON_VMA_NAME is not set.  It seems this patch
> > needs to handle that case?
>
> Hi SeongJae,
> Thanks for reporting! I'll post an updated version fixing this config.

Fix is posted at
https://lore.kernel.org/all/20240123231014.3801041-3-surenb@xxxxxxxxxx/
as part of v2 of this patchset.
Thanks,
Suren.
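
For readers following the build failure: the errors come from code touching a
struct member that only exists when a config option is enabled. The block below
is a minimal userspace sketch of one common way to handle that, by routing all
access through an accessor that has a stub variant. It is not the v2 fix linked
above. Every name in it (DEMO_ANON_VMA_NAME, demo_vma, demo_name,
demo_vma_name, demo_pin_name) is hypothetical and stands in for kernel types
only loosely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a config switch. Remove this define to model
 * a build where the option is off; the code below still compiles. */
#define DEMO_ANON_VMA_NAME 1

struct demo_name {
	int refcount;
};

struct demo_vma {
	void *file;
#ifdef DEMO_ANON_VMA_NAME
	struct demo_name *anon_name;	/* present only with the option on */
#endif
};

#ifdef DEMO_ANON_VMA_NAME
static struct demo_name *demo_vma_name(struct demo_vma *vma)
{
	return vma->anon_name;
}
#else
/* Stub: with the option off there is never a name to report. */
static struct demo_name *demo_vma_name(struct demo_vma *vma)
{
	(void)vma;
	return NULL;
}
#endif

/*
 * Callers use the accessor and never name the optional member, so they
 * build under both configurations. Returns 1 if a reference was taken,
 * 0 if there was no name to pin.
 */
static int demo_pin_name(struct demo_vma *vma)
{
	struct demo_name *name;

	assert(vma != NULL);
	name = demo_vma_name(vma);
	if (!name)
		return 0;
	name->refcount++;
	return 1;
}
```

The point of the accessor is that the #ifdef lives in one place instead of at
every call site, which is the kind of direct access the compiler rejected in
the report quoted below.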

> Suren.
>
>
> >
> >     .../linux/fs/proc/task_mmu.c: In function ‘get_vma_snapshow’:
> >     .../linux/fs/proc/task_mmu.c:145:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       145 |         if (copy->anon_name && !anon_vma_name_get_rcu(copy))
> >           |                   ^~~~~~~~~
> >           |                   anon_vma
> >     .../linux/fs/proc/task_mmu.c:161:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       161 |         if (copy->anon_name)
> >           |                   ^~~~~~~~~
> >           |                   anon_vma
> >     .../linux/fs/proc/task_mmu.c:162:41: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       162 |                 anon_vma_name_put(copy->anon_name);
> >           |                                         ^~~~~~~~~
> >           |                                         anon_vma
> >     .../linux/fs/proc/task_mmu.c: In function ‘put_vma_snapshot’:
> >     .../linux/fs/proc/task_mmu.c:174:18: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       174 |         if (vma->anon_name)
> >           |                  ^~~~~~~~~
> >           |                  anon_vma
> >     .../linux/fs/proc/task_mmu.c:175:40: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       175 |                 anon_vma_name_put(vma->anon_name);
> >           |                                        ^~~~~~~~~
> >           |                                        anon_vma
> >
> > [...]
> >
> >
> > Thanks,
> > SJ




