Re: [RFC PATCH] arch: arm64: introduce RODATA_FULL_USE_PTE_CONT

On Mon, Nov 22, 2021 at 4:52 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Mon, 22 Nov 2021 at 06:29, Huangzhaoyang <huangzhaoyang@xxxxxxxxx> wrote:
> >
> > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> >
> > The kernel linear mapping is split down to the smallest (PAGE_SIZE)
> > granularity when RODATA_FULL is applied, which can increase TLB
> > pressure. Introduce a method to apply PTE_CONT to ptes where possible.
> >
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
>
> How would this lead to TLB pressure, and how does the use of
> contiguous mappings mitigate that? The linear mapping of the kernel is
> rarely used, as all normal accesses to it go via the vmalloc region,
> so in which case would TLB entries be allocated for this region in a
> way that could cause a measurable performance impact?
In fact, the patch applies PTE_CONT *outside* the range of the
kernel text. Would you please have a look at the code? It applies
PTE_CONT during map_mem() and then clears it when load_module()
changes the permissions of the corresponding linear mapping of the
area it allocates in the vmalloc region.
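
To make that flow concrete, here is a minimal sketch of the
clear-then-change step (the helper name make_linear_page_ro() is
hypothetical; __change_memory_common(), CONT_PTE_MASK and
CONT_PTE_SIZE are the existing arm64 symbols the patch builds on):

    /*
     * Hedged sketch: before changing the attributes of one page in
     * the linear map, first drop the contiguous hint from the whole
     * naturally aligned CONT_PTE_SIZE block containing it, so the
     * block no longer has to keep CONT_PTES (16 with 4K pages)
     * identical ptes.
     */
    static void make_linear_page_ro(struct page *page)
    {
            unsigned long addr = (unsigned long)page_address(page);

            /* Clear PTE_CONT across the containing contiguous block. */
            __change_memory_common(addr & CONT_PTE_MASK, CONT_PTE_SIZE,
                                   __pgprot(0), __pgprot(PTE_CONT));

            /* Now this single page can be remapped read-only. */
            __change_memory_common(addr, PAGE_SIZE,
                                   __pgprot(PTE_RDONLY), __pgprot(PTE_WRITE));
    }

Note that clearing PTE_CONT on a live mapping is only tolerated here
because the relaxed pgattr_change_is_safe() mask in the mmu.c hunk
below treats PTE_CONT as a safe attribute to update.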
>
>
> > ---
> >  arch/arm64/Kconfig       |  9 +++++++++
> >  arch/arm64/mm/mmu.c      | 10 ++++++++--
> >  arch/arm64/mm/pageattr.c |  9 +++++++++
> >  3 files changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index fee914c..3f8fbf0 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1198,6 +1198,15 @@ config RODATA_FULL_DEFAULT_ENABLED
> >           This requires the linear region to be mapped down to pages,
> >           which may adversely affect performance in some cases.
> >
> > +config RODATA_FULL_USE_PTE_CONT
> > +       bool "Apply PTE_CONT when RODATA_FULL_DEFAULT_ENABLED enabled"
> > +       depends on RODATA_FULL_DEFAULT_ENABLED
> > +       default y
> > +       help
> > +         Apply PTE_CONT to the linear mapping wherever possible when
> > +         RODATA_FULL_DEFAULT_ENABLED is enabled, which can reduce the
> > +         performance impact of the small pte granularity.
> > +
> >  config ARM64_SW_TTBR0_PAN
> >         bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
> >         help
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index cfd9deb..8017b17 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -124,15 +124,21 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
> >          * The following mapping attributes may be updated in live
> >          * kernel mappings without the need for break-before-make.
> >          */
> > +#ifndef CONFIG_RODATA_FULL_USE_PTE_CONT
> >         pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE | PTE_NG;
> > +#else
> > +       pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE | PTE_NG | PTE_CONT;
> > +#endif
> >
> >         /* creating or taking down mappings is always safe */
> >         if (old == 0 || new == 0)
> >                 return true;
> >
> >         /* live contiguous mappings may not be manipulated at all */
> > -       if ((old | new) & PTE_CONT)
> > +#ifndef CONFIG_RODATA_FULL_USE_PTE_CONT
> > +       if ((old | new) & PTE_CONT)
> >                 return false;
> > +#endif
> >
> >         /* Transitioning from Non-Global to Global is unsafe */
> >         if (old & ~new & PTE_NG)
> > @@ -206,7 +212,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
> >
> >                 /* use a contiguous mapping if the range is suitably aligned */
> >                 if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) &&
> > -                   (flags & NO_CONT_MAPPINGS) == 0)
> > +                   (IS_ENABLED(CONFIG_RODATA_FULL_USE_PTE_CONT) || (flags & NO_CONT_MAPPINGS) == 0))
> >                         __prot = __pgprot(pgprot_val(prot) | PTE_CONT);
> >
> >                 init_pte(pmdp, addr, next, phys, __prot);
> > diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
> > index a3bacd7..88a87eb 100644
> > --- a/arch/arm64/mm/pageattr.c
> > +++ b/arch/arm64/mm/pageattr.c
> > @@ -99,6 +99,15 @@ static int change_memory_common(unsigned long addr, int numpages,
> >         if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
> >                             pgprot_val(clear_mask) == PTE_RDONLY)) {
> >                 for (i = 0; i < area->nr_pages; i++) {
> > +#ifdef CONFIG_RODATA_FULL_USE_PTE_CONT
> > +                       unsigned long cont_pte_low_bound;
> > +                       unsigned long addr;
> > +
> > +                       addr = (u64)page_address(area->pages[i]);
> > +                       cont_pte_low_bound = addr & CONT_PTE_MASK;
> > +                       __change_memory_common(cont_pte_low_bound,
> > +                                              CONT_PTE_SIZE, __pgprot(0), __pgprot(PTE_CONT));
> > +#endif
> >                         __change_memory_common((u64)page_address(area->pages[i]),
> >                                                PAGE_SIZE, set_mask, clear_mask);
> >                 }
> > --
> > 1.9.1
> >
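
As background for the TLB question above: the arm64 contiguous hint
lets the TLB cache a single entry for CONT_PTES adjacent ptes (16
with 4K pages, i.e. one entry per 64K instead of one per 4K), but
the hint may only be set when the whole block is suitably aligned.
A hedged restatement of the condition the patch relies on, mirroring
the alignment check in alloc_init_cont_pte() (the helper name is
illustrative):

    /*
     * Sketch: a contiguous pte mapping is only legal when the
     * virtual range [addr, next) and the physical base are all
     * aligned to CONT_PTE_SIZE, so the CONT_PTES entries cover one
     * naturally aligned block with identical attributes.
     */
    static bool can_use_cont_pte(unsigned long addr, unsigned long next,
                                 phys_addr_t phys)
    {
            return ((addr | next | phys) & ~CONT_PTE_MASK) == 0;
    }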



