Re: [PATCH v5 06/25] mm: Add PG_ARCH_2 page flag

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 24, 2020 at 11:33:07AM -0700, Andrew Morton wrote:
> On Wed, 24 Jun 2020 18:52:25 +0100 Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > From: Steven Price <steven.price@xxxxxxx>
> > For arm64 MTE support it is necessary to be able to mark pages that
> > contain user space visible tags that will need to be saved/restored e.g.
> > when swapped out.
> > 
> > To support this add a new arch specific flag (PG_ARCH_2) that arch code
> > can opt into using ARCH_USES_PG_ARCH_2.
> > 
> > ...
> >
> > --- a/fs/proc/page.c
> > +++ b/fs/proc/page.c
> > @@ -217,6 +217,9 @@ u64 stable_page_flags(struct page *page)
> >  	u |= kpf_copy_bit(k, KPF_PRIVATE_2,	PG_private_2);
> >  	u |= kpf_copy_bit(k, KPF_OWNER_PRIVATE,	PG_owner_priv_1);
> >  	u |= kpf_copy_bit(k, KPF_ARCH,		PG_arch_1);
> > +#ifdef CONFIG_ARCH_USES_PG_ARCH_2
> > +	u |= kpf_copy_bit(k, KPF_ARCH_2,	PG_arch_2);
> > +#endif
> 
> Do we need CONFIG_ARCH_USES_PG_ARCH_2?  What would be the downside to
> giving every architecture a PG_arch_2, but only arm64 uses it (at
> present)?

It turns out we have another issue with this flag. PG_arch_2 in the
arm64 MTE patches is used to mark a page as having valid tags. During
set_pte_at(), if the mapping type is tagged, we set PG_arch_2 (also
setting it in other cases like copy_page). In combination with THP and
swap (and some stress-testing to force swap-out), the kernel ends up
clearing PG_arch_2 in __split_huge_page_tail(), causing a subsequent
set_pte_at() to zero valid tags stored by user.

The quick fix is to add an arch_huge_page_flags_split_preserve macro
(need to think of a shorter name) which adds 1L << PG_arch_2 to the
preserve list in the above mentioned function. However, I wonder whether
it's safe to add both PG_arch_1 and PG_arch_2 to this list. At least on
arm and arm64, PG_arch_1 is used to mark a page as D-cache clean (and
don't need to do this again after splitting a pmd):

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 78c84bee7e29..22b3236a6dd8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2364,6 +2364,10 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_workingset) |
 			 (1L << PG_locked) |
 			 (1L << PG_unevictable) |
+			 (1L << PG_arch_1) |
+#ifdef CONFIG_64BIT
+			 (1L << PG_arch_2) |
+#endif
 			 (1L << PG_dirty)));
 
 	/* ->mapping in first tail page is compound_mapcount */

Thanks.

-- 
Catalin



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux