Re: [PATCH v2 0/5] Avoid building lrugen page table walk code

On Thu, Jul 6, 2023 at 12:21 AM Aneesh Kumar K.V
<aneesh.kumar@xxxxxxxxxxxxx> wrote:
>
> This patchset avoids building changes added by commit bd74fdaea146 ("mm:
> multi-gen LRU: support page table walks") on platforms that don't support
> hardware atomic updates of access bits.
>
> Aneesh Kumar K.V (5):
>   mm/mglru: Create a new helper iterate_mm_list_walk
>   mm/mglru: Move Bloom filter code around
>   mm/mglru: Move code around to make future patch easy
>   mm/mglru: move iterate_mm_list_walk Helper
>   mm/mglru: Don't build multi-gen LRU page table walk code on
>     architecture not supported
>
>  arch/Kconfig               |   3 +
>  arch/arm64/Kconfig         |   1 +
>  arch/x86/Kconfig           |   1 +
>  include/linux/memcontrol.h |   2 +-
>  include/linux/mm_types.h   |  10 +-
>  include/linux/mmzone.h     |  12 +-
>  kernel/fork.c              |   2 +-
>  mm/memcontrol.c            |   2 +-
>  mm/vmscan.c                | 955 +++++++++++++++++++------------------
>  9 files changed, 528 insertions(+), 460 deletions(-)

1. There is no need for a new Kconfig -- the condition is simply
   defined(CONFIG_LRU_GEN) && !defined(arch_has_hw_pte_young)

2. The best practice for disabling static functions is not macros but a
   constant condition that lets the compiler drop them:

    static int const_cond(void)
    {
        return 1;
    }

    int main(void)
    {
        int a = const_cond();

        if (a)
                return 0;

        /* the compiler doesn't generate code for static funcs below */
        static_func_1();
        ...
        static_func_N();
    }

LTO can also drop unused external functions, but not everyone builds
with it. So we still need macros for those, and of course for data
structures.
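Items 1 and 2 can be combined. What follows is a minimal userspace
sketch, not code from the attached patch: no_mm_walk(),
walk_page_tables() and age_pages() are hypothetical names, and
CONFIG_LRU_GEN is defined by hand. It assumes the usual convention where
an architecture that provides arch_has_hw_pte_young() also defines a
macro of the same name, so the preprocessor can test for it.

```c
#include <stdbool.h>

/* Assumption for this sketch: the option is enabled. */
#define CONFIG_LRU_GEN 1

/*
 * Model of an architecture with hardware-updated access bits: it supplies
 * the helper and defines a macro of the same name so that the
 * preprocessor can see it. Remove both to model other architectures.
 */
static inline bool arch_has_hw_pte_young(void)
{
	return true;
}
#define arch_has_hw_pte_young arch_has_hw_pte_young

/* Hypothetical helper: the condition from item 1 as a constant. */
static inline bool no_mm_walk(void)
{
#if defined(CONFIG_LRU_GEN) && !defined(arch_has_hw_pte_young)
	return true;
#else
	return false;
#endif
}

/* Hypothetical stand-in for the static page table walk code. */
static int walk_page_tables(int nr_pages)
{
	return nr_pages * 2;
}

/* Hypothetical caller. */
int age_pages(int nr_pages)
{
	/*
	 * When no_mm_walk() is a constant true, an optimizing compiler can
	 * drop the call below and, with it, walk_page_tables() itself.
	 */
	if (no_mm_walk())
		return 0;

	return walk_page_tables(nr_pages);
}
```

With the model lines for arch_has_hw_pte_young() removed, no_mm_walk()
becomes a constant true and the walk code is unreachable, without any
#ifdef around the static functions themselves.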

3. In 4/5, you have:

    @@ -461,6 +461,7 @@ enum {
     struct lru_gen_mm_state {
        /* set to max_seq after each iteration */
        unsigned long seq;
    +#ifdef CONFIG_LRU_TASK_PAGE_AGING
        /* where the current iteration continues after */
        struct list_head *head;
        /* where the last iteration ended before */
    @@ -469,6 +470,11 @@ struct lru_gen_mm_state {
        unsigned long *filters[NR_BLOOM_FILTERS];
        /* the mm stats for debugging */
        unsigned long stats[NR_HIST_GENS][NR_MM_STATS];
    +#else
    +   /* protect the seq update above */
    +   /* May be we can use lruvec->lock?  */
    +   spinlock_t lock;
    +#endif
     };

The answer is yes: lruvec->lru_lock can protect the seq update. Beyond
that, we don't need lru_gen_mm_state at all.
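A rough userspace sketch of that idea, under stated assumptions: struct
lruvec_sketch and every function below are hypothetical, a C11
atomic_flag stands in for lruvec->lru_lock, and the increment on success
is an assumption about what follows the comparison. Only the locked
comparison against max_seq matches what the objdump output further down
shows.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct lruvec. */
struct lruvec_sketch {
	atomic_flag lru_lock;	/* stands in for lruvec->lru_lock */
	unsigned long max_seq;	/* stands in for lrugen->max_seq */
};

static void lruvec_sketch_init(struct lruvec_sketch *lruvec,
			       unsigned long max_seq)
{
	atomic_flag_clear(&lruvec->lru_lock);
	lruvec->max_seq = max_seq;
}

/* Spin until the flag is acquired; a toy replacement for spin_lock_irq(). */
static void sketch_lock(struct lruvec_sketch *lruvec)
{
	while (atomic_flag_test_and_set_explicit(&lruvec->lru_lock,
						 memory_order_acquire))
		;
}

static void sketch_unlock(struct lruvec_sketch *lruvec)
{
	atomic_flag_clear_explicit(&lruvec->lru_lock, memory_order_release);
}

/*
 * Only the caller that still sees the current max_seq succeeds, so no
 * separate lock or per-lruvec mm state is needed to serialize the update.
 */
bool try_to_inc_max_seq_sketch(struct lruvec_sketch *lruvec,
			       unsigned long max_seq)
{
	bool success;

	sketch_lock(lruvec);
	success = max_seq == lruvec->max_seq;
	if (success)
		lruvec->max_seq++;	/* assumption: advance on success */
	sketch_unlock(lruvec);

	return success;
}
```

A second caller that passes the stale max_seq fails the comparison and
leaves the counter alone.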

I'm attaching a patch that fixes all of the above. If you want to post
it, please feel free, but please test it fully first, since I haven't.
Otherwise I can ask TJ to help make this work for you.

$ git diff --stat
 include/linux/memcontrol.h |   2 +-
 include/linux/mm_types.h   |  12 +-
 include/linux/mmzone.h     |   2 +
 kernel/bounds.c            |   6 +-
 kernel/fork.c              |   2 +-
 mm/vmscan.c                | 169 +++++++++++++++++++--------
 6 files changed, 137 insertions(+), 56 deletions(-)

On x86:

$ ./scripts/bloat-o-meter mm/vmscan.o.old mm/vmscan.o
add/remove: 24/34 grow/shrink: 2/7 up/down: 966/-8716 (-7750)
Function                                     old     new   delta
...
should_skip_vma                              206       -    -206
get_pte_pfn                                  261       -    -261
lru_gen_add_mm                               323       -    -323
lru_gen_seq_show                            1710    1370    -340
lru_gen_del_mm                               432       -    -432
reset_batch_size                             572       -    -572
try_to_inc_max_seq                          2947    1635   -1312
walk_pmd_range_locked                       1508       -   -1508
walk_pud_range                              3238       -   -3238
Total: Before=99449, After=91699, chg -7.79%

$ objdump -S mm/vmscan.o | grep -A 20 "<try_to_inc_max_seq>:"
000000000000a350 <try_to_inc_max_seq>:
{
    a350: e8 00 00 00 00        call   a355 <try_to_inc_max_seq+0x5>
    a355: 55                    push   %rbp
    a356: 48 89 e5              mov    %rsp,%rbp
    a359: 41 57                 push   %r15
    a35b: 41 56                 push   %r14
    a35d: 41 55                 push   %r13
    a35f: 41 54                 push   %r12
    a361: 53                    push   %rbx
    a362: 48 83 ec 70           sub    $0x70,%rsp
    a366: 41 89 d4              mov    %edx,%r12d
    a369: 49 89 f6              mov    %rsi,%r14
    a36c: 49 89 ff              mov    %rdi,%r15
spin_lock_irq(&lruvec->lru_lock);
    a36f: 48 8d 5f 50           lea    0x50(%rdi),%rbx
    a373: 48 89 df              mov    %rbx,%rdi
    a376: e8 00 00 00 00        call   a37b <try_to_inc_max_seq+0x2b>
success = max_seq == lrugen->max_seq;
    a37b: 49 8b 87 88 00 00 00  mov    0x88(%r15),%rax
    a382: 4c 39 f0              cmp    %r14,%rax

Attachment: no_mm_walk.patch
Description: Binary data

