On Fri, Apr 21, 2023 at 10:20 AM Peter Collingbourne <pcc@xxxxxxxxxx> wrote: > > On Fri, Apr 21, 2023 at 5:24 AM Catalin Marinas <catalin.marinas@xxxxxxx> wrote: > > > > On Thu, Apr 20, 2023 at 02:09:45PM -0700, Peter Collingbourne wrote: > > > Consider the following sequence of events: > > > > > > 1) A page in a PROT_READ|PROT_WRITE VMA is faulted. > > > 2) Page migration allocates a page with the KASAN allocator, > > > causing it to receive a non-match-all tag, and uses it > > > to replace the page faulted in 1. > > > 3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1. > > > > Ah, so there is no race here, it's simply because the page allocation > > for migration has a non-match-all kasan tag in page->flags. > > > > How do we handle the non-migration case with mprotect()? IIRC > > post_alloc_hook() always resets the page->flags since > > GFP_HIGHUSER_MOVABLE has the __GFP_SKIP_KASAN_UNPOISON flag. > > Yes, that's how it normally works. > > > > As a result of step 3, we are left with a non-match-all tag for a page > > > with tags accessible to userspace, which can lead to the same kind of > > > tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in > > > copy_highpage with HW tags only") intended to fix. > > > > > > The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED > > > is that they cannot have a non-match-all tag. As a result of step 2, the > > > invariant is broken. This means that the fix in the referenced commit > > > was incomplete and we also need to reset the tag for pages without > > > PG_mte_tagged. > > > > > > Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags") > > > > This commit was reverted in 20794545c146 (arm64: kasan: Revert "arm64: > > mte: reset the page tag in page->flags"). It looks a bit strange to fix > > it up. > > It does seem strange but I think it is correct because that is when > the bug (resetting tag only if PG_mte_tagged) was introduced. The > revert preserved the bug because it did not account for the migration > case, which means that it didn't account for migration+mprotect > either. > > > > diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c > > > index 4aadcfb01754..a7bb20055ce0 100644 > > > --- a/arch/arm64/mm/copypage.c > > > +++ b/arch/arm64/mm/copypage.c > > > @@ -21,9 +21,10 @@ void copy_highpage(struct page *to, struct page *from) > > > > > > copy_page(kto, kfrom); > > > > > > + if (kasan_hw_tags_enabled()) > > > + page_kasan_tag_reset(to); > > > + > > > if (system_supports_mte() && page_mte_tagged(from)) { > > > - if (kasan_hw_tags_enabled()) > > > - page_kasan_tag_reset(to); > > > > This should work but can we not do this at allocation time like we do > > for the source page and remove any page_kasan_tag_reset() here > > altogether? > > That would be difficult because of the number of different ways that > the page can be allocated. That's why we also decided to reset it here > in commit e74a68468062. Ping. Peter