On Thu, 4 Jun 2020 17:39:49 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote: > On 04.06.20 17:23, SeongJae Park wrote: > > On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote: > > > >> On 04.06.20 09:26, SeongJae Park wrote: > >>> On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote: > >>> > >>>> On 03.06.20 16:11, SeongJae Park wrote: > >>>>> From: SeongJae Park <sjpark@xxxxxxxxx> > >>>>> > >>>>> This commit implements the four callbacks (->init_target_regions, > >>>>> ->update_target_regions, ->prepare_access_check, and ->check_accesses) > >>>>> for the basic access monitoring of the physical memory address space. > >>>>> By setting the callback pointers to point those, users can easily > >>>>> monitor the accesses to the physical memory. > >>>>> > >>>>> Internally, it uses the PTE Accessed bit, as similar to that of the > >>>>> virtual memory support. Also, it supports only page frames that > >>>>> supported by idle page tracking. Acutally, most of the code is stollen > >>>>> from idle page tracking. Users who want to use other access check > >>>>> primitives and monitor the frames that not supported with this > >>>>> implementation could implement their own callbacks on their own. > >>>>> > >>>>> Signed-off-by: SeongJae Park <sjpark@xxxxxxxxx> > >>>>> --- > >>>>> include/linux/damon.h | 5 ++ > >>>>> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++ > >>>>> 2 files changed, 189 insertions(+) > >>>>> > >>>>> diff --git a/include/linux/damon.h b/include/linux/damon.h > >>>>> index 1a788bfd1b4e..f96503a532ea 100644 > >>>>> --- a/include/linux/damon.h > >>>>> +++ b/include/linux/damon.h > >>>>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx); > >>>>> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx); > >>>>> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx); > >>>>> > >>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx); > >>>>> +void kdamond_update_phys_regions(struct damon_ctx *ctx); > >>>>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx); > >>>>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx); > >>>>> + > >>>>> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids); > >>>>> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, > >>>>> unsigned long aggr_int, unsigned long regions_update_int, > >>>>> diff --git a/mm/damon.c b/mm/damon.c > >>>>> index f5cbc97a3bbc..6a5c6d540580 100644 > >>>>> --- a/mm/damon.c > >>>>> +++ b/mm/damon.c > >>>>> @@ -19,7 +19,9 @@ > >>>>> #include <linux/mm.h> > >>>>> #include <linux/module.h> > >>>>> #include <linux/page_idle.h> > >>>>> +#include <linux/pagemap.h> > >>>>> #include <linux/random.h> > >>>>> +#include <linux/rmap.h> > >>>>> #include <linux/sched/mm.h> > >>>>> #include <linux/sched/task.h> > >>>>> #include <linux/slab.h> > >>>>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx) > >>>>> } > >>>>> } > >>>>> > >>>>> +/* Do nothing. Users should set the initial regions by themselves */ > >>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx) > >>>>> +{ > >>>>> +} > >>>>> + > >>>>> static void damon_mkold(struct mm_struct *mm, unsigned long addr) > >>>>> { > >>>>> pte_t *pte = NULL; > >>>>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx) > >>>>> return max_nr_accesses; > >>>>> } > >>>>> > >>>>> +/* access check functions for physical address based regions */ > >>>>> + > >>>>> +/* This code is stollen from page_idle.c */ > >>>>> +static struct page *damon_phys_get_page(unsigned long pfn) > >>>>> +{ > >>>>> + struct page *page; > >>>>> + pg_data_t *pgdat; > >>>>> + > >>>>> + if (!pfn_valid(pfn)) > >>>>> + return NULL; > >>>>> + > >>>> > >>>> Who provides these pfns? Can these be random pfns, supplied unchecked by > >>>> user space? Or are they at least mapped into some user space process? > >>> > >>> Your guess is right, users can give random physical address and that will be > >>> translated into pfn. > >>> > >> > >> Note the difference to idle tracking: "Idle page tracking only considers > >> user memory pages", this is very different to your use case. Note that > >> this is why there is no pfn_to_online_page() check in page idle code. > > > > My use case is same to that of idle page. I also ignore non-user pages. > > Actually, this function is for filtering of the non-user pages, which is simply > > stollen from the page_idle. > > Okay, that is valuable information, I missed that. The comment in > page_idle.c is actually pretty valuable. > > In both cases, user space can provide random physical address but you > will only care about user pages. Understood. > > That turns things less dangerous. :) Glad to hear this. I will refine this point in the next spin! :) > > >>>> IOW, do we need a pfn_to_online_page() to make sure the memmap even was > >>>> initialized? > >>> > >>> Thank you for pointing out this! I will use it in the next spin. Also, this > >>> code is stollen from page_idle_get_page(). Seems like it should also be > >>> modified to use it. I will send the patch for it, either. > >> > >> pfn_to_online_page() will only succeed for system RAM pages, not > >> dax/pmem (ZONE_DEVICE). dax/pmem needs special care. > >> > >> I can spot that you are taking references to random struct pages. This > >> looks dangerous to me and might mess in complicated ways with page > >> migration/isolation/onlining/offlining etc. I am not sure if we want that. > > > > AFAIU, page_idle users can also pass random pfns by randomly accessing the > > bitmap file. Am I missing something? > > I am definitely no expert on page idle tracking. If that is the case, > then we'll also need pfn_to_online_page() handling (and might have to > care about ZONE_DEVICE, not hard but needs some extra LOCs). Agree, I will post the patch soon. That said, if you get any doubt, please don't hesitate yelling. > > I am still not sure if grabbing references on theoretically isolated > pageblocks is okay, but that's just complicated stuff and as you state, > is already performed. At least I can read "With such an indicator of > user pages we can skip isolated pages". So isolated pages during page > migration are properly handled. > > > Instead of stealing, factor out, document, and reuse? That makes it > clearer that you are not inventing the wheel, and if we have to fix > something, we only have to fix at a single point. Good point, I will consider reusing the code instead of stealing in the next spin. Thanks, SeongJae Park > > -- > Thanks, > > David / dhildenb >