On Tue, May 17, 2022 at 8:09 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> Hi,

Hi Catalin,

> That's more of an RFC to get a discussion started. I plan to eventually
> apply the third patch reverting the page_kasan_tag_reset() calls under
> arch/arm64 since they don't cover all cases (the race is rare and we
> haven't hit anything yet but it's possible).
>
> On a system with MTE and KASAN_HW_TAGS enabled, when a page is allocated
> kasan_unpoison_pages() sets a random tag and saves it in page->flags so
> that page_to_virt() re-creates the correct tagged pointer. We need to
> ensure that the in-memory tags are visible before setting the
> page->flags:
>
>   P0 (__kasan_unpoison_range):  P1 (access via virt_to_page):
>     Wtags=x                       Rflags=x
>       |                             |
>       | DMB                         | address dependency
>       V                             V
>     Wflags=x                      Rtags=x

This is confusing: the paragraph mentions page_to_virt() and the
diagram - virt_to_page(). I assume it should be page_to_virt().

alloc_pages(), which calls kasan_unpoison_pages(), has to return
before page_to_virt() can be called, so the two can only race if the
tags don't get propagated to memory before alloc_pages() returns,
right? Is this why you say that the race is rare?

> The first patch changes the order of page unpoisoning with the tag
> storing in page->flags. page_kasan_tag_set() has the right barriers
> through try_cmpxchg().

[...]

> If such page is mapped in user-space with PROT_MTE, the architecture
> code will set the tag to 0 and a subsequent page_to_virt() dereference
> will fault. We currently try to fix this by resetting the tag in
> page->flags so that it is 0xff (match-all, not faulting).
> However, setting the tags and flags can race with another CPU reading
> the flags (page_to_virt()) and barriers can't help, e.g.:
>
>   P0 (mte_sync_page_tags):      P1 (memcpy from virt_to_page):
>                                   Rflags!=0xff
>     Wflags=0xff
>     DMB (doesn't help)
>     Wtags=0
>                                   Rtags=0  // fault

So this change effectively makes the tag in page->flags for GFP_USER
pages be reset at allocation time. And the current approach of
resetting the tag when the kernel is about to access these pages is
not good because:

1. it's inconvenient to track all the places where this should be
   done, and
2. the tag reset can race with page_to_virt() even with patch #1
   applied.

Is my understanding correct?

This will reset the tags for all kinds of GFP_USER allocations, not
only for the ones intended for MAP_ANONYMOUS and RAM-based file
mappings, for which userspace can set tags, right? This will thus
weaken in-kernel MTE for pages whose tags can't even be set by
userspace. Is there a way to deal with this?

> Since clearing the flags in the arch code doesn't work, try to do this
> at page allocation time by a new flag added to GFP_USER. Could we
> instead add __GFP_SKIP_KASAN_UNPOISON rather than a new flag?

Why do we need a new flag? Can we just check & GFP_USER instead?

Thanks!