On 2025-02-13 at 17:20:22 +0100, Maciej Wieczor-Retman wrote:
>On 2025-02-13 at 02:28:08 +0100, Andrey Konovalov wrote:
>>On Thu, Feb 13, 2025 at 2:21 AM Andrey Konovalov <andreyknvl@xxxxxxxxx> wrote:
>>>
>>> On Tue, Feb 11, 2025 at 7:07 PM Maciej Wieczor-Retman
>>> <maciej.wieczor-retman@xxxxxxxxx> wrote:
>>> >
>>> > I did some experiments with multiple addresses passed through
>>> > kasan_mem_to_shadow(). And it seems like we can get almost any address
>>> > out when we consider any random bogus pointers.
>>> >
>>> > I used the KASAN_SHADOW_OFFSET from your example above. Userspace
>>> > addresses seem to map to the range [KASAN_SHADOW_OFFSET - 0xffff8fffffffffff].
>>> > Then going through non-canonical addresses until 0x0007ffffffffffff we
>>> > reach the end of kernel LA and we loop around. Then the addresses seem
>>> > to go from 0 until we again start reaching the kernel space and then it
>>> > maps into the proper shadow memory.
>>> >
>>> > It gave me the same results when using the previous version of
>>> > kasan_mem_to_shadow(), so I'm wondering whether I'm doing this experiment
>>> > incorrectly or if there aren't any addresses we can rule out here?
>>>
>>> By the definition of the shadow mapping, if we apply that mapping to
>>> the whole 64-bit address space, the result will only contain 1/8th
>>> (1/16th for SW/HW_TAGS) of that space.
>>>
>>> For example, with the current upstream value of KASAN_SHADOW_OFFSET on
>>> x86 and arm64, the value of the top 3 bits (4 for SW/HW_TAGS) of any
>>> shadow address is always the same: KASAN_SHADOW_OFFSET's value is
>>> such that the shadow address calculation never overflows. Addresses
>>> that have a different value for those top 3 bits are the ones we can
>>> rule out.
>>
>>Eh, scratch that, the 3rd bit from the top changes, as
>>KASAN_SHADOW_OFFSET is not that well-aligned a value, but the overall
>>size of the mapping holds.
>>
>>> The KASAN_SHADOW_OFFSET value from my example does rely on the
>>> overflow (arguably, this makes things more confusing [1]). But still,
>>> the possible values of shadow addresses should only cover 1/16th of
>>> the address space.
>>>
>>> So whether the address belongs to that 1/8th (1/16th) of the address
>>> space is what we want to check in kasan_non_canonical_hook().
>>>
>
>Right, I somehow forgot that obviously the whole LA has to map to 1/16th of
>the address space and it should stay contiguous.
>
>After rethinking how the mapping worked before and will work after making
>stuff signed, I thought this patch could make use of the overflow?
>
>From what I noticed, all the Kconfig values for KASAN_SHADOW_OFFSET should
>make it so there will be overflow when inputting more and more positive
>addresses.
>
>So maybe we should first find what the most negative and most positive
>(signed) addresses map to in shadow memory address space. And then, when
>looking for invalid values that aren't the product of kasan_mem_to_shadow(),
>we should check
>
>	if (addr > kasan_mem_to_shadow(biggest_positive_address) &&
>	    addr < kasan_mem_to_shadow(smallest_negative_address))
>		return;
>
>Is this correct?
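To sanity-check those bounds I put the mapping into a small userspace sketch.
The scale shift and offset below are my assumptions, not values taken from the
patch: a signed shift by 4 and a KASAN_SHADOW_OFFSET of 0xffff800000000000,
which is what reproduces the two shadow values quoted further down:

	#include <stdio.h>
	#include <stdint.h>

	/* Assumed values, chosen to match the numbers in this thread. */
	#define SHADOW_OFFSET      0xffff800000000000ULL
	#define SHADOW_SCALE_SHIFT 4

	/*
	 * Signed (arithmetic) shift, then the offset is added mod 2^64,
	 * i.e. the addition is allowed to wrap.
	 */
	static uint64_t mem_to_shadow(uint64_t addr)
	{
		return (uint64_t)((int64_t)addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
	}

	int main(void)
	{
		/* Most positive and most negative addresses, interpreted as signed. */
		printf("%016llx\n", (unsigned long long)mem_to_shadow(0x7fffffffffffffffULL));
		printf("%016llx\n", (unsigned long long)mem_to_shadow(0x8000000000000000ULL));
		return 0;
	}

With these assumptions every shadow address lands in the window that wraps
from 0xf7ff800000000000 through zero up to 0x07ff7fffffffffff, so anything
strictly between those two values can never be the output of the mapping,
which is exactly what the quoted check rules out.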
I suppose the original code in the patch does the same thing when you change
the || into &&:

	if (addr < KASAN_SHADOW_OFFSET - max_shadow_size / 2 &&
	    addr >= KASAN_SHADOW_OFFSET + max_shadow_size / 2)
		return;

	kasan_mem_to_shadow(0x7FFFFFFFFFFFFFFF) -> 0x07ff7fffffffffff
	kasan_mem_to_shadow(0x8000000000000000) -> 0xf7ff800000000000

Also, after thinking about this overflow and what maps where, I rechecked
kasan_shadow_to_mem() and addr_has_metadata(), and they seem to return the
values I'd expect without making any changes there. Just mentioning this
because I recall you asked about it at the start of this thread.

-- 
Kind regards
Maciej Wieczór-Retman