Re: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On Thu, 4 Jun 2020 17:39:49 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote:

> On 04.06.20 17:23, SeongJae Park wrote:
> > On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote:
> > 
> >> On 04.06.20 09:26, SeongJae Park wrote:
> >>> On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>>
> >>>> On 03.06.20 16:11, SeongJae Park wrote:
> >>>>> From: SeongJae Park <sjpark@xxxxxxxxx>
> >>>>>
> >>>>> This commit implements the four callbacks (->init_target_regions,
> >>>>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
> >>>>> for the basic access monitoring of the physical memory address space.
> >>>>> By setting the callback pointers to point to those, users can easily
> >>>>> monitor the accesses to the physical memory.
> >>>>>
> >>>>> Internally, it uses the PTE Accessed bit, similar to the virtual
> >>>>> memory support.  Also, it supports only the page frames that are
> >>>>> supported by idle page tracking.  Actually, most of the code is stolen
> >>>>> from idle page tracking.  Users who want to use other access check
> >>>>> primitives, or to monitor frames that this implementation does not
> >>>>> support, can implement their own callbacks.
> >>>>>
> >>>>> Signed-off-by: SeongJae Park <sjpark@xxxxxxxxx>
> >>>>> ---
> >>>>>  include/linux/damon.h |   5 ++
> >>>>>  mm/damon.c            | 184 ++++++++++++++++++++++++++++++++++++++++++
> >>>>>  2 files changed, 189 insertions(+)
> >>>>>
> >>>>> diff --git a/include/linux/damon.h b/include/linux/damon.h
> >>>>> index 1a788bfd1b4e..f96503a532ea 100644
> >>>>> --- a/include/linux/damon.h
> >>>>> +++ b/include/linux/damon.h
> >>>>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
> >>>>>  void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
> >>>>>  unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
> >>>>>  
> >>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
> >>>>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
> >>>>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
> >>>>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
> >>>>> +
> >>>>>  int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
> >>>>>  int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> >>>>>  		unsigned long aggr_int, unsigned long regions_update_int,
> >>>>> diff --git a/mm/damon.c b/mm/damon.c
> >>>>> index f5cbc97a3bbc..6a5c6d540580 100644
> >>>>> --- a/mm/damon.c
> >>>>> +++ b/mm/damon.c
> >>>>> @@ -19,7 +19,9 @@
> >>>>>  #include <linux/mm.h>
> >>>>>  #include <linux/module.h>
> >>>>>  #include <linux/page_idle.h>
> >>>>> +#include <linux/pagemap.h>
> >>>>>  #include <linux/random.h>
> >>>>> +#include <linux/rmap.h>
> >>>>>  #include <linux/sched/mm.h>
> >>>>>  #include <linux/sched/task.h>
> >>>>>  #include <linux/slab.h>
> >>>>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
> >>>>>  	}
> >>>>>  }
> >>>>>  
> >>>>> +/* Do nothing.  Users should set the initial regions by themselves */
> >>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
> >>>>> +{
> >>>>> +}
> >>>>> +
> >>>>>  static void damon_mkold(struct mm_struct *mm, unsigned long addr)
> >>>>>  {
> >>>>>  	pte_t *pte = NULL;
> >>>>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
> >>>>>  	return max_nr_accesses;
> >>>>>  }
> >>>>>  
> >>>>> +/* access check functions for physical address based regions */
> >>>>> +
> >>>>> +/* This code is stolen from page_idle.c */
> >>>>> +static struct page *damon_phys_get_page(unsigned long pfn)
> >>>>> +{
> >>>>> +	struct page *page;
> >>>>> +	pg_data_t *pgdat;
> >>>>> +
> >>>>> +	if (!pfn_valid(pfn))
> >>>>> +		return NULL;
> >>>>> +
> >>>>
> >>>> Who provides these pfns? Can these be random pfns, supplied unchecked by
> >>>> user space? Or are they at least mapped into some user space process?
> >>>
> >>> Your guess is right, users can give random physical addresses, and those
> >>> will be translated into pfns.
> >>>
> >>
> >> Note the difference to idle tracking: "Idle page tracking only considers
> >> user memory pages", this is very different to your use case. Note that
> >> this is why there is no pfn_to_online_page() check in page idle code.
> > 
> > My use case is the same as that of idle page tracking.  I also ignore
> > non-user pages.  Actually, this function is for filtering out the non-user
> > pages, and it is simply stolen from page_idle.
> 
> Okay, that is valuable information, I missed that. The comment in
> page_idle.c is actually pretty valuable.
> 
> In both cases, user space can provide random physical addresses but you
> will only care about user pages. Understood.
> 
> That makes things less dangerous. :)

Glad to hear this.  I will refine this point in the next spin! :)

> 
> >>>> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
> >>>> initialized?
> >>>
> >>> Thank you for pointing this out!  I will use it in the next spin.  Also,
> >>> this code is stolen from page_idle_get_page().  It seems that function
> >>> should also be modified to use it.  I will send a patch for it as well.
> >>
> >> pfn_to_online_page() will only succeed for system RAM pages, not
> >> dax/pmem (ZONE_DEVICE). dax/pmem needs special care.
> >>
> >> I can spot that you are taking references to random struct pages. This
> >> looks dangerous to me and might mess in complicated ways with page
> >> migration/isolation/onlining/offlining etc. I am not sure if we want that.
> > 
> > AFAIU, page_idle users can also pass random pfns by randomly accessing the
> > bitmap file.  Am I missing something?
> 
> I am definitely no expert on page idle tracking. If that is the case,
> then we'll also need pfn_to_online_page() handling (and might have to
> care about ZONE_DEVICE, not hard but needs some extra LOCs).

Agreed, I will post the patch soon.  That said, if you have any doubts, please
don't hesitate to yell.
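
The order of checks discussed above (valid pfn, online memmap, user page,
reference taken only if the page is not free) can be sketched in a small
userspace model.  This is only an illustration, not the patch: every name with
a model_ prefix is hypothetical, and the boolean fields merely stand in for
what pfn_to_online_page(), PageLRU() and get_page_unless_zero() would decide
in the kernel.  The LRU lock recheck and the ZONE_DEVICE handling are left out.

```c
/* Userspace model only: none of these types or helpers are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PFNS 8UL

struct model_page {
	bool online;	/* stands in for pfn_to_online_page() succeeding */
	bool lru;	/* stands in for PageLRU(), i.e. a user page */
	int refcount;	/* 0 means the page is free */
};

static struct model_page model_memmap[MODEL_NR_PFNS];

/* Return the page for @pfn with a reference held, or NULL if filtered out. */
static struct model_page *model_get_page(unsigned long pfn)
{
	struct model_page *page;

	if (pfn >= MODEL_NR_PFNS)	/* models !pfn_valid(pfn) */
		return NULL;

	page = &model_memmap[pfn];
	if (!page->online)		/* memmap may be uninitialized */
		return NULL;
	if (!page->lru)			/* not a user page, skip it */
		return NULL;
	if (page->refcount == 0)	/* models !get_page_unless_zero() */
		return NULL;

	page->refcount++;
	return page;
}

/* Drop the reference taken by model_get_page(). */
static void model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}
```

In this model, an arbitrary pfn coming from user space either fails one of the
filters and yields NULL, or yields a page whose reference the caller has to
drop again with model_put_page().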

> 
> I am still not sure if grabbing references on theoretically isolated
> pageblocks is okay, but that's just complicated stuff and as you state,
> is already performed. At least I can read "With such an indicator of
> user pages we can skip isolated pages". So isolated pages during page
> migration are properly handled.
> 
> 
> Instead of stealing, factor out, document, and reuse? That makes it
> clearer that you are not reinventing the wheel, and if we have to fix
> something, we only have to fix it in a single place.

Good point, I will consider reusing the code instead of stealing in the next
spin.
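
As a rough picture of what "factor out and reuse" could mean here, the
userspace sketch below has one shared page filter with two callers, one in the
role of idle page tracking and one in the role of the physical address
monitoring.  All names are hypothetical and the struct is a toy stand-in for
struct page.  The real refactoring may well look different.

```c
/* Userspace model only: all names here are made up for illustration. */
#include <stdbool.h>
#include <stddef.h>

struct model_page {
	bool user;	/* the frame backs a user page */
	bool accessed;	/* models the Accessed bit having been set */
};

/* The single shared filter.  A fix made here reaches both callers. */
static struct model_page *model_get_user_page(struct model_page *map,
					      size_t nr, size_t pfn)
{
	if (pfn >= nr || !map[pfn].user)
		return NULL;
	return &map[pfn];
}

/* Caller in the role of idle page tracking: is this frame idle? */
static bool model_page_is_idle(struct model_page *map, size_t nr, size_t pfn)
{
	struct model_page *page = model_get_user_page(map, nr, pfn);

	return page && !page->accessed;
}

/* Caller in the role of the monitor: count accessed frames in [start, end). */
static unsigned int model_count_accessed(struct model_page *map, size_t nr,
					 size_t start, size_t end)
{
	unsigned int nr_accessed = 0;
	size_t pfn;

	for (pfn = start; pfn < end; pfn++) {
		struct model_page *page = model_get_user_page(map, nr, pfn);

		if (page && page->accessed)
			nr_accessed++;
	}
	return nr_accessed;
}
```

Neither caller repeats the filtering logic, so non-user frames are skipped in
the same way on both paths.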


Thanks,
SeongJae Park

> 
> -- 
> Thanks,
> 
> David / dhildenb
> 


