On Tue, May 17, 2022 at 8:09 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> Hi,

Hi Catalin,

> That's more of an RFC to get a discussion started. I plan to eventually
> apply the third patch reverting the page_kasan_tag_reset() calls under
> arch/arm64 since they don't cover all cases (the race is rare and we
> haven't hit anything yet but it's possible).
>
> On a system with MTE and KASAN_HW_TAGS enabled, when a page is allocated
> kasan_unpoison_pages() sets a random tag and saves it in page->flags so
> that page_to_virt() re-creates the correct tagged pointer. We need to
> ensure that the in-memory tags are visible before setting the
> page->flags:
>
>   P0 (__kasan_unpoison_range):  P1 (access via virt_to_page):
>     Wtags=x                       Rflags=x
>       |                             |
>       | DMB                         | address dependency
>       V                             V
>     Wflags=x                      Rtags=x

This is confusing: the paragraph mentions page_to_virt() and the
diagram - virt_to_page(). I assume it should be page_to_virt().

alloc_pages(), which calls kasan_unpoison_pages(), has to return
before page_to_virt() can be called, so the two can only race if the
tags don't get propagated to memory before alloc_pages() returns,
right? Is this why you say that the race is rare?

> The first patch changes the order of page unpoisoning with the tag
> storing in page->flags. page_kasan_tag_set() has the right barriers
> through try_cmpxchg().

[...]

> If such page is mapped in user-space with PROT_MTE, the architecture
> code will set the tag to 0 and a subsequent page_to_virt() dereference
> will fault. We currently try to fix this by resetting the tag in
> page->flags so that it is 0xff (match-all, not faulting).
> However, setting the tags and flags can race with another CPU reading
> the flags (page_to_virt()) and barriers can't help, e.g.:
>
>   P0 (mte_sync_page_tags):      P1 (memcpy from virt_to_page):
>                                   Rflags!=0xff
>     Wflags=0xff
>     DMB (doesn't help)
>     Wtags=0
>                                   Rtags=0  // fault

So this change effectively makes the tag in page->flags for GFP_USER
pages be reset at allocation time. And the current approach of
resetting the tag when the kernel is about to access these pages is
not good because:

1. it's inconvenient to track all the places where this should be
   done, and
2. the tag reset can race with page_to_virt() even with patch #1
   applied.

Is my understanding correct?

This will reset the tags for all kinds of GFP_USER allocations, not
only for the ones intended for MAP_ANONYMOUS and RAM-based file
mappings, for which userspace can set tags, right? This will thus
weaken in-kernel MTE for pages whose tags can't even be set by
userspace. Is there a way to deal with this?

> Since clearing the flags in the arch code doesn't work, try to do this
> at page allocation time by a new flag added to GFP_USER. Could we
> instead add __GFP_SKIP_KASAN_UNPOISON rather than a new flag?

Why do we need a new flag? Can we just check & GFP_USER instead?

Thanks!