On Fri, Jul 05, 2024 at 10:58:29AM -0700, Linus Torvalds wrote:
> On Fri, 5 Jul 2024 at 04:25, Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > we'd probably want to use an address that lives between the two TTBRs
> > (i.e. in the "guard region" you mentioned above), just in case somebody
> > has fscked around with /proc/sys/vm/mmap_min_addr.
>
> Yes, I don't want to use a NULL pointer and rely on mmap_min_addr.
>
> For x86-64, we have two "guard regions" that can be used to generate
> an address that is guaranteed to fault:
>
>  - the kernel always lives in the "top bit set" part of the address
>    space (and any address tagging bits don't touch that part), and does
>    not map the highest virtual address because that's used for error
>    pointers, so the "all bits set" address always faults

The same should be true on arm64, though I'm not immediately sure if we
explicitly reserve that VA region -- if we don't, then we should.

>  - the region between valid user addresses and kernel addresses is
>    also always going to fault, and we don't have them adjacent to each
>    other (unlike, for example, 32-bit i386, where the kernel address
>    space is directly adjacent to the top of user addresses)

Today we have a gap between the TTBR0 and TTBR1 VA ranges in all
configurations, but in future (with the new FEAT_D128 page table format)
we will have configurations where there's no gap between the two ranges.

> So on x86-64, the simple solution is to just say "we know if the top
> bit is clear, it cannot ever touch kernel code, and if the top bit is
> set we have to make the address fault". So just duplicating the top
> bit (with an arithmetic shift) and or'ing it with the low bits, we get
> exactly what we want.
>
> But my knowledge of arm64 is weak enough that while I am reading
> assembly language and I know that instead of the top bit, it's bit55,
> I don't know what the actual rules for the translation table registers
> are.
>
> If the all-bits-set address is guaranteed to always trap, then arm64
> could just use the same thing x86 does (just duplicating bit 55
> instead of the sign bit)?

I think something of that shape can work (see below).

There are a couple of things that make using all-ones unsafe:

1) Non-faulting parts of a misaligned load/store can occur *before* the
   fault is raised. If you have two adjacent pages, one of which is
   writeable and the other of which is not (in either order), a store
   which straddles those pages can write to the writeable page before
   raising a fault on the non-writeable page.

   I've seen this behaviour on real HW, and IIUC this is fairly common.

2) Loads/stores which wrap past 0xFFFF_FFFF_FFFF_FFFF access bytes at
   UNKNOWN addresses. An N-byte store at 0xFFFF_FFFF_FFFF_FFFF may write
   N-1 bytes at an arbitrary address which is not
   0x0000_0000_0000_0000.

   In the latest ARM ARM (K.a), this is described tersely in section
   K1.2.9 "Out of range virtual address". That can be found at:

   https://developer.arm.com/documentation/ddi0487/ka/?lang=en

   I'm aware of implementation styles where that address is not zero and
   can be a TTBR1 (kernel) address.

Given that, we'd need to avoid all-ones, but provided we know that the
first access using the pointer will be limited to PAGE_SIZE bytes past
the pointer, we could round down the bad pointer to be somewhere within
the error pointer page, e.g.

	SBFX	<mask>, <ptr>, #55, #1
	ORR	<ptr>, <ptr>, <mask>
	BIC	<ptr>, <ptr>, <mask>, lsr #(64 - PAGE_SHIFT)

That last `BIC` instruction is "BIt Clear", AKA "AND NOT". When bit 55
is one, it will clear the lower bits to round down to a page boundary,
and when bit 55 is zero it will have no effect (as it'll be an AND with
all-ones).
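In rough C terms the sequence above computes something like the sketch
below (illustration only: it assumes 4K pages, relies on arithmetic
right shift of signed values as the kernel does, and the helper name is
made up rather than an existing kernel interface):

	#include <stdint.h>

	#define PAGE_SHIFT	12	/* assuming 4K pages for this sketch */

	/* Rough C equivalent of the SBFX/ORR/BIC sequence above. */
	static inline uint64_t round_down_bad_ptr(uint64_t ptr)
	{
		/* SBFX <mask>, <ptr>, #55, #1: replicate bit 55 into all 64 bits */
		uint64_t mask = (uint64_t)((int64_t)(ptr << 8) >> 63);

		/* ORR: a bit-55-set pointer becomes all-ones */
		ptr |= mask;

		/*
		 * BIC (AND NOT): in the all-ones case, clear the low PAGE_SHIFT
		 * bits to round down to the start of the last page; in the
		 * bit-55-clear case the mask is zero, so this ANDs with all-ones
		 * and leaves the pointer untouched.
		 */
		ptr &= ~(mask >> (64 - PAGE_SHIFT));

		return ptr;
	}

i.e. it's the same bit-duplication trick as on x86-64, just keyed off
bit 55 and with the extra rounding step so that the first (at most
PAGE_SIZE-byte) access stays below the all-ones address.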
Thanks,
Mark.