Re: [PATCH] Hitshield : Something new eviction process for MGLRU

David Hildenbrand <david@xxxxxxxxxx> · Wed, 7 Aug 2024 13:05:14 +0200

On 07.08.24 04:10, 조민우 wrote:
2024년 8월 5일 (월) 오후 8:56, David Hildenbrand <david@xxxxxxxxxx 
<mailto:david@xxxxxxxxxx>>님이 작성:

    On 02.08.24 02:05, Minwoo Jo wrote:
     > Signed-off-by: Minwoo Jo <chminoo@xxxxxxxxxxxx
    <mailto:chminoo@xxxxxxxxxxxx>>
     >
     > This commit addresses the page space occupancy issue that arose
     > in the previous commit with the HitShield technology.
     >
     > The HitShield technology is based on the observation that MGLRU
     > does not take into account the state of the folio during eviction.
     >
     > I assumed that if a folio, which has been updated only 1 to 3 times,
     > has increased in generation until the point of eviction,
     > it is likely to be referenced less in the future.
     >
     > Therefore, I implemented this technology to protect frequently
    updated folios.
     >
     > I added the hitshield_0, hitshield_1, and hitshield flags to the
    page flag.
     >
     > Upon my review, I confirmed that the flags occupy positions 0 to 28.
     >
     > Thus, I believe that allocating 3 flags, or 3 bits, is sufficient.
     >
     > The hitshield_0 and hitshield_1 flags serve as count flags,
    representing the digit positions in binary.
     >
     > The hitshield flag is a boolean flag used to verify whether the
    HitShield is actually enabled.
     >
     > Each time a folio is added to lrugen, the hitshield_0 and
    hitshield_1 flags are cleared to reset them.
     >
     > Subsequently, in the folio_update_gen function, I added the
    following code:
     >
     > if (!folio_test_hitshield(folio))
     >      if (folio_test_change_hitshield_0(folio))
     >          if (folio_test_change_hitshield_1(folio))
     >              folio_set_hitshield(folio);
     >
     > This code counts the HitShield, and if it exceeds 5, it sets the
    hitshield flag.
     >
     > In the sort_folio function, which is executed for eviction, if
    the folio's hitshield flag is set,
     > it ages the folio to max_gen and resets the flag.
     >
     > The testing was conducted on Ubuntu 24.04 LTS (Kernel 6.8.0)
     > with an AMD Ryzen 9 5950X 16Core (32Threads) environment,
    restricting DRAM to 750MiB through Docker.
     >
     > In this environment, the ycsb benchmark was tested using the
    following command:
     >
     > ./bin/ycsb load mongodb -s -P workloads/workloadf -p
    recordcount=100000000 -p mongodb.batchsize=1024
     > -threads 32 -p mongodb.url="mongodb://localhost:27017/ycsb"
     >
     > During testing, a reduction of 78.9% in pswpin and 75.6% in
    pswpout was measured.
     >
     > Additionally, the runtime for the Load command decreased by 18.3%,
     > and the runtime for the Run command decreased by 9.3%.
     >
     > However, when running a single-threaded program with the current
    implementation,
     > while a decrease in swap counts was observed, the runtime
    increased compared to the native MGLRU.
     >
     > A thorough analysis is needed to understand why this occurs, but
    I think it appears
     > that there is significant overhead from the flag calculations.
     >
     > Therefore, it seems that if we activate this functionality
    through an ifdef statement
     > only in certain situations, we could achieve performance benefits.
     >
     > As an undergraduate student who is still self-studying the
    kernel, I am not very familiar with using ifdef,
     > so I did not write that part separately.
     >
     > I apologize for not being able to test it with large memory swap
    workloads,
     > as I was unsure what would be appropriate.
     >
     > I would appreciate your review and feedback on this approach.
     >
     > Thank you.
     > ---
     >   include/linux/mm_inline.h      |  4 ++++
     >   include/linux/page-flags.h     | 26 ++++++++++++++++++++++++++
     >   include/trace/events/mmflags.h |  5 ++++-
     >   mm/vmscan.c                    | 22 ++++++++++++++++++++++
     >   4 files changed, 56 insertions(+), 1 deletion(-)
     >
     > diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
     > index f4fe593c1400..dea613b2785c 100644
     > --- a/include/linux/mm_inline.h
     > +++ b/include/linux/mm_inline.h
     > @@ -266,6 +266,10 @@ static inline bool lru_gen_add_folio(struct
    lruvec *lruvec, struct folio *folio,
     >       else
     >               list_add(&folio->lru,
    &lrugen->folios[gen][type][zone]);
     >
     > +     folio_clear_hitshield_0(folio);
     > +     folio_clear_hitshield_1(folio);
     > +     /* This for initialize hit_shield by 0 when folio add to gen */
     > +
     >       return true;
     >   }
     >
     > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
     > index 5769fe6e4950..70951c6fe4ce 100644
     > --- a/include/linux/page-flags.h
     > +++ b/include/linux/page-flags.h
     > @@ -130,6 +130,16 @@ enum pageflags {
     >       PG_arch_2,
     >       PG_arch_3,
     >   #endif
     > +     PG_hitshield_0,
     > +     PG_hitshield_1,
     > +     PG_hitshield,
     > +
     > +     /*
     > +      * The flags consist of two types: one for counting the
    update occurrences
     > +      * of the folio and another boolean flag to check whether
    the HitShield is activated.
     > +      */

    Note that adding new pageflags is frowned upon -- we're actually trying
    to get rid of some of them, like PG_error. Getting another 3 added
    seems
    very .. unlikely :)

    Especially when done unconditionally on 32bit as well this might just
    not work at all, and as you state, would require #ifdefery.

    -- 
    Cheers,

    David / dhildenb

I could not understand my intention to reduce the page flag due to my 
limited knowledge.

However, since this method is expected to improve performance when a 
swap occurs in
a conditionally multi-threaded environment, I believe it would not be a 
problem to merge
it in a conditional format like ifdef.

In my previous commit attempt, I tried adding a uchar variable to the 
page, but unfortunately,
this method was rejected due to the lack of space on the page.

At that time, the suggested approach utilized the fact that there are a 
few bits left in the page flags,
but I would appreciate any additional ideas you might have.

Note that we don't hand out page flags like free candy, even if we would 
still have some spare ones and it would improve performance for some 
workloads. :)

See include/linux/page-flags-layout.h, page_flags are not only used to 
store flags, but plenty of other stuff (some depending on the kernel 
cfg) that either cannot be stored elsewhere or would be very expensive 
to lookup otherwise.

Even on 64bit we will sooner or later run out of flags if we let every 
optimization use them. And we all know that reclaiming already used 
flags is rather a pain, and nobody is willing to give up optimizations 
once they landed.

... again, some people are actively working on reclaiming page flags 
(e.g., Willy currently with PG_error).

Adding 3 flags to optimize MGLRU -- after MGLRU already consumes 3 
additional ones -- is unlikely to happen.

It would be great if you could look into alternative approaches that 
don't require new page flags and/or other page metadata.

Once we managed to decouple "struct folio" from "struct page" it might 
get easier to add per-folio metadata, although we also don't want to 
grow that struct arbitrarily in size.

--
Cheers,

David / dhildenb