Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes: > On Tue, 12 Sep 2023 10:52:25 -0700 Stefan Roesch <shr@xxxxxxxxxxxx> wrote: > >> This change adds a "smart" page scanning mode for KSM. So far all the >> candidate pages are continuously scanned to find candidates for >> de-duplication. There are a considerably number of pages that cannot be >> de-duplicated. This is costly in terms of CPU. By using smart scanning >> considerable CPU savings can be achieved. >> >> This change takes the history of scanning pages into account and skips >> the page scanning of certain pages for a while if de-deduplication for >> this page has not been successful in the past. >> >> To do this it introduces two new fields in the ksm_rmap_item structure: >> age and skip_age. age, is the KSM age and skip_page is the age for how > > s/skip_page/skip_age/ > Fixed in the next version. >> long page scanning of this page is skipped. The age field is incremented >> each time the page is scanned and the page cannot be de-duplicated. >> >> How often a page is skipped is dependent how often de-duplication has >> been tried so far and the number of skips is currently limited to 8. >> This value has shown to be effective with different workloads. >> >> The feature is currently disable by default and can be enabled with the >> new smart_scan knob. >> >> The feature has shown to be very effective: upt to 25% of the page scans >> can be eliminated; the pages_to_scan rate can be reduced by 40 - 50% and >> a similar de-duplication rate can be maintained. >> > > All seems nice. I'll sit out v1, see what people have to say. > > Some nits: > >> --- a/mm/ksm.c >> +++ b/mm/ksm.c >> >> ... >> >> @@ -2305,6 +2314,45 @@ static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot, >> return rmap_item; >> } >> >> +static unsigned int inc_skip_age(rmap_age_t age) >> +{ >> + if (age <= 3) >> + return 1; >> + if (age <= 5) >> + return 2; >> + if (age <= 8) >> + return 4; >> + >> + return 8; >> +} > > "inc_skip_age" sounds like it increments something. Can we give it a > better name? > > And a nice comment explaining its role in life. > Renamed it to skip_age in the next version and added a comment. >> +static bool skip_rmap_item(struct page *page, struct ksm_rmap_item *rmap_item) >> +{ >> + rmap_age_t age; >> + >> + if (!ksm_smart_scan) >> + return false; >> + >> + if (PageKsm(page)) >> + return false; >> + >> + age = rmap_item->age++; >> + if (age < 3) >> + return false; >> + >> + if (rmap_item->skip_age == age) { >> + rmap_item->skip_age = 0; >> + return false; >> + } >> + >> + if (rmap_item->skip_age == 0) { >> + rmap_item->skip_age = age + inc_skip_age(age); >> + remove_rmap_item_from_tree(rmap_item); >> + } >> + >> + return true; >> +} > > Would a better name be should_skip_rmap_item()? > Renamed it to should_skip_rmap_item(). > But even that name implies that the function is idempotent (has no > side-effects). Again, an explanatory comment would be good. And > simple comments over each non-obvious `if' statement. > Added more comments to the function to explain the different cases. >> >> ... >>