RE: [PATCH v2 0/3] mm/damon: Profiling enhancements for DAMON

"Prasad, Aravinda" <aravinda.prasad@xxxxxxxxx> · Tue, 19 Mar 2024 10:56:42 +0000

> -----Original Message-----
> From: SeongJae Park <sj@xxxxxxxxxx>
> Sent: Tuesday, March 19, 2024 10:51 AM
> To: Prasad, Aravinda <aravinda.prasad@xxxxxxxxx>
> Cc: damon@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; sj@xxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; s2322819@xxxxxxxx; Kumar, Sandeep4
> <sandeep4.kumar@xxxxxxxxx>; Huang, Ying <ying.huang@xxxxxxxxx>;
> Hansen, Dave <dave.hansen@xxxxxxxxx>; Williams, Dan J
> <dan.j.williams@xxxxxxxxx>; Subramoney, Sreenivas
> <sreenivas.subramoney@xxxxxxxxx>; Kervinen, Antti
> <antti.kervinen@xxxxxxxxx>; Kanevskiy, Alexander
> <alexander.kanevskiy@xxxxxxxxx>
> Subject: Re: [PATCH v2 0/3] mm/damon: Profiling enhancements for DAMON
> 
> Hi Aravinda,
> 
> 
> Thank you for posting this new revision!
> 
> I remember I told you that I don't see a high level significant problems on on
> the reply to the previous revision of this patch[1], but I show a concern now.
> Sorry for not raising this earlier, but let me explain my humble concerns before
> being even more late.

Sure, no problem. We can discuss. I will get back to you with a detailed note.

Regards,
Aravinda

> 
> On Mon, 18 Mar 2024 18:58:45 +0530 Aravinda Prasad
> <aravinda.prasad@xxxxxxxxx> wrote:
> 
> > DAMON randomly samples one or more pages in every region and tracks
> > accesses to them using the ACCESSED bit in PTE (or PMD for 2MB pages).
> > When the region size is large (e.g., several GBs), which is common for
> > large footprint applications, detecting whether the region is accessed
> > or not completely depends on whether the pages that are actively
> > accessed in the region are picked during random sampling.
> > If such pages are not picked for sampling, DAMON fails to identify the
> > region as accessed. However, increasing the sampling rate or
> > increasing the number of regions increases CPU overheads of kdamond.
> 
> DAMON uses sampling because it considers a region as accessed if a portion of
> the region that big enough to be detected via sampling is all accessed.  If a
> region is having some pages that really accessed but the proportion is too
> small to be found via sampling, I think DAMON could say the overall access to
> the region is only modest and could even be ignored.  In my humble opinion,
> this fits with the definition of DAMON region: A memory address range that
> constructed with pages having similar access frequency.

> 
> >
> > This patch proposes profiling different levels of the
> > application\u2019s page table tree to detect whether a region is
> > accessed or not. This patch set is based on the observation that, when
> > the accessed bit for a page is set, the accessed bits at the higher
> > levels of the page table tree (PMD/PUD/PGD) corresponding to the path
> > of the page table walk are also set. Hence, it is efficient to check
> > the accessed bits at the higher levels of the page table tree to
> > detect whether a region is accessed or not. For example, if the access
> > bit for a PUD entry is set, then one or more pages in the 1GB PUD
> > subtree is accessed as each PUD entry covers 1GB mapping. Hence,
> > instead of sampling thousands of 4K/2M pages to detect accesses in a
> > large region, sampling at the higher level of page table tree is faster and
> efficient.
> 
> Due to the above reason, I concern this could result in making DAMON
> monitoring results be inaccurately biased to report more than real accesses.
> 
> >
> > This patch set is based on 6.8-rc5 kernel (commit: f48159f8,
> > mm-unstable
> > tree)
> >
> > Changes since v1 [1]
> > ====================
> >
> >  - Added support for 5-level page table tree
> >  - Split the patch to mm infrastructure changes and DAMON enhancements
> >  - Code changes as per comments on v1
> >  - Added kerneldoc comments
> >
> > [1] https://lkml.org/lkml/2023/12/15/272
> >
> > Evaluation:
> >
> > - MASIM benchmark with 1GB, 10GB, 100GB footprint with 10% hot data
> >   and 5TB with 10GB hot data.
> > - DAMON: 5ms sampling, 200ms aggregation interval. Rest all
> >   parameters set to default value.
> > - DAMON+PTP: Page table profiling applied to DAMON with the above
> >   parameters.
> >
> > Profiling efficiency in detecting hot data:
> >
> > Footprint	1GB	10GB	100GB	5TB
> > ---------------------------------------------
> > DAMON		>90%	<50%	 ~0%	  0%
> > DAMON+PTP	>90%	>90%	>90%	>90%
> 
> Sampling interval is the time interval that assumed to be large enough for the
> workload to make meaningful amount of accesses within the interval.  Hence,
> meaningful amount of sampling interval depends on the workload's
> characteristic and system's memory bandwidth.
> 
> Here, the size of the hot memory region is about 100MB, 1GB, 10GB, and
> 10GB for the four cases, respectively.  And you set the sampling interval as
> 5ms.  Let's assume the system can access, say, 50 GB per second, and hence it
> could be able to access only up to 250 MB per 5ms.  So, in case of 1GB and
> footprint, all hot memory region would be accessed while DAMON is waiting
> for next sampling interval.  Hence, DAMON would be able to see most
> accesses via sampling.  But for 100GB footprint case, only 250MB / 10GB =
> about 2.5% of the hot memory region would be accessed between the
> sampling interval.  DAMON cannot see whole accesses, and hence the
> precision could be low.
> 
> I don't know exact memory bandwith of the system, but to detect the 10 GB
> hot region with 5ms sampling interval, the system should be able to access
> 2GB memory per millisecond, or about 2TB memory per second.  I think
> systems of such memory bandwidth is not that common.
> 
> I show you also explored a configuration setting the aggregation interval
> higher.  But because each sampling checks only access between the sampling
> interval, that might not help in this setup.  I'm wondering if you also explored
> increasing sampling interval.
> 
> Sorry again for finding this concern not early enough.  But I think we may need
> to discuss about this first.
> 
> [1] https://lkml.kernel.org/r/20231215201159.73845-1-sj@xxxxxxxxxx
> 
> 
> Thanks,
> SJ
> 
> 
> >
> > CPU overheads (in billion cycles) for kdamond:
> >
> > Footprint	1GB	10GB	100GB	5TB
> > ---------------------------------------------
> > DAMON		1.15	19.53	3.52	9.55
> > DAMON+PTP	0.83	 3.20	1.27	2.55
> >
> > A detailed explanation and evaluation can be found in the arXiv paper:
> > https://arxiv.org/pdf/2311.10275.pdf
> >
> >
> > Aravinda Prasad (3):
> >   mm/damon: mm infrastructure support
> >   mm/damon: profiling enhancement
> >   mm/damon: documentation updates
> >
> >  Documentation/mm/damon/design.rst |  42 ++++++
> >  arch/x86/include/asm/pgtable.h    |  20 +++
> >  arch/x86/mm/pgtable.c             |  28 +++-
> >  include/linux/mmu_notifier.h      |  36 +++++
> >  include/linux/pgtable.h           |  79 ++++++++++
> >  mm/damon/vaddr.c                  | 233 ++++++++++++++++++++++++++++--
> >  6 files changed, 424 insertions(+), 14 deletions(-)
> >
> > --
> > 2.21.3