On Fri, May 14, 2021 at 10:50:01AM +0100, Catalin Marinas wrote:
> To ensure that instructions are observable in a new mapping, the arm64
> set_pte_at() implementation cleans the D-cache and invalidates the
> I-cache to the PoU. As an optimisation, this is only done on executable
> mappings and the PG_dcache_clean page flag is set to avoid future cache
> maintenance on the same page.
>
> When two different processes map the same page (e.g. private executable
> file or shared mapping) there's a potential race on checking and setting
> PG_dcache_clean via set_pte_at() -> __sync_icache_dcache(). While on the
> fault paths the page is locked (PG_locked), mprotect() does not take the
> page lock. The result is that one process may see the PG_dcache_clean
> flag set but the I/D cache maintenance not yet performed.
>
> Avoid test_and_set_bit(PG_dcache_clean) in favour of separate test_bit()
> and set_bit(). In the rare event of a race, the cache maintenance is
> done twice.
>
> Signed-off-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Steven Price <steven.price@xxxxxxx>
> ---
>
> Found while debating with Steven a similar race on PG_mte_tagged. For
> the latter we'll have to take a lock but hopefully in practice it will
> only happen when restoring from swap. Separate thread anyway.
>
> There's at least arch/arm with a similar race. Powerpc seems to do it
> properly with separate test/set. Other architectures have a bigger
> problem as they do a similar check in update_mmu_cache(), called after
> the pte was already exposed to user.
>
> I looked at fixing this in the mprotect() code but taking the page lock
> will slow it down, so not sure how popular this would be for such a rare
> race.
>
>  arch/arm64/mm/flush.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> index ac485163a4a7..6d44c028d1c9 100644
> --- a/arch/arm64/mm/flush.c
> +++ b/arch/arm64/mm/flush.c
> @@ -55,8 +55,10 @@ void __sync_icache_dcache(pte_t pte)
>  {
>  	struct page *page = pte_page(pte);
>
> -	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
> +	if (!test_bit(PG_dcache_clean, &page->flags)) {
>  		sync_icache_aliases(page_address(page), page_size(page));
> +		set_bit(PG_dcache_clean, &page->flags);
> +	}

Acked-by: Will Deacon <will@xxxxxxxxxx>

I wondered about the ISB for a bit (we don't broadcast it), but should be
fine as the racing CPU needs to return to userspace.

Will