Re: [PATCH v7 1/2] arm64: Define Documentation/arm64/tagged-address-abi.rst

Catalin Marinas <catalin.marinas@xxxxxxx> · Mon, 12 Aug 2019 18:36:12 +0100

On Fri, Aug 09, 2019 at 07:10:18AM -0700, Dave Hansen wrote:
> On 8/8/19 10:27 AM, Catalin Marinas wrote:
> > On Wed, Aug 07, 2019 at 01:38:16PM -0700, Dave Hansen wrote:
> >> Also, shouldn't this be converted over to an arch_prctl()?
> > 
> > What do you mean by arch_prctl()? We don't have such thing, apart from
> > maybe arch_prctl_spec_ctrl_*(). We achieve the same thing with the
> > {SET,GET}_TAGGED_ADDR_CTRL macros. They could be renamed to
> > arch_prctl_tagged_addr_{set,get} or something but I don't see much
> > point.
> 
> Silly me.  We have an x86-specific:
> 
> 	SYSCALL_DEFINE2(arch_prctl, int , option, unsigned long , arg2)
> 
> I guess there's no ARM equivalent so you're stuck with the generic one.
> 
> > What would be better (for a separate patch series) is to clean up
> > sys_prctl() and move the arch-specific options into separate
> > arch_prctl() under arch/*/kernel/. But it's not really for this series.
> 
> I think it does make sense for truly arch-specific features to stay out
> of the arch-generic prctl().  Yes, I know I've personally violated this
> in the past. :)

Maybe Dave M could revive his prctl() clean-up patches which moves the
arch specific cases to the corresponding arch/*/ code

> >> What is the scope of these prctl()'s?  Are they thread-scoped or
> >> process-scoped?  Can two threads in the same process run with different
> >> tagging ABI modes?
> > 
> > Good point. They are thread-scoped and this should be made clear in the
> > doc. Two threads can have different modes.
> > 
> > The expectation is that this is invoked early during process start (by
> > the dynamic loader or libc init) while in single-thread mode and
> > subsequent threads will inherit the same mode. However, other uses are
> > possible.
> 
> If that's the expectation, it would be really nice to codify it.
> Basically, you can't enable the feature if another thread is already
> been forked off.

Well, that's my expectation but I'm not a userspace developer. I don't
think there is any good reason to prevent it. It doesn't cost us
anything to support in the kernel, other than making the documentation
clearer.

> >>> +When a process has successfully enabled the new ABI by invoking
> >>> +prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE), the following
> >>> +behaviours are guaranteed:
> >>> +
> >>> +- Every currently available syscall, except the cases mentioned in section
> >>> +  3, can accept any valid tagged pointer. The same rule is applicable to
> >>> +  any syscall introduced in the future.
> >>> +
> >>> +- The syscall behaviour is undefined for non valid tagged pointers.
> >>
> >> Do you really mean "undefined"?  I mean, a bad pointer is a bad pointer.
> >>  Why should it matter if it's a tagged bad pointer or an untagged bad
> >> pointer?
> > 
> > Szabolcs already replied here. We may have tagged pointers that can be
> > dereferenced just fine but being passed to the kernel may not be well
> > defined (e.g. some driver doing a find_vma() that fails unless it
> > explicitly untags the address). It's as undefined as the current
> > behaviour (without these patches) guarantees.
> 
> It might just be nicer to point out what this features *changes* about
> invalid pointer handling, which is nothing. :)  Maybe:
> 
> 	The syscall behaviour for invalid pointers is the same for both
> 	tagged and untagged pointers.

Good point.

> >>> +- prctl(PR_SET_MM, ``*``, ...) other than arg2 PR_SET_MM_MAP and
> >>> +  PR_SET_MM_MAP_SIZE.
> >>> +
> >>> +- prctl(PR_SET_MM, PR_SET_MM_MAP{,_SIZE}, ...) struct prctl_mm_map fields.
> >>> +
> >>> +Any attempt to use non-zero tagged pointers will lead to undefined
> >>> +behaviour.
> >>
> >> I wonder if you want to generalize this a bit.  I think you're saying
> >> that parts of the ABI that modify the *layout* of the address space
> >> never accept tagged pointers.
> > 
> > I guess our difficulty in specifying this may have been caused by
> > over-generalising. For example, madvise/mprotect came under the same
> > category but there is a use-case for malloc'ed pointers (and tagged) to
> > the kernel (e.g. MADV_DONTNEED). If we can restrict the meaning to
> > address space *layout* manipulation, we'd have mmap/mremap/munmap,
> > brk/sbrk, prctl(PR_SET_MM). Did I miss anything?. Other related syscalls
> > like mprotect/madvise preserve the layout while only changing permissions,
> > backing store, so the would be allowed to accept tags.
> 
> shmat() comes to mind.  I also did a quick grep for mmap_sem taken for
> write and didn't see anything else obvious jump out at me.

I'll document shmat() as not supported, together with the prctl().

As I submitted a fixup already, I propose that we allow tagged pointers
on mmap/munmap/mremap/brk. It makes the documentation simpler ;) (and
the user understanding of what is and is not allowed).

> >> It looks like the TAG_SHIFT and tag size are pretty baked into the
> >> aarch64 architecture.  But, are you confident that no future
> >> implementations will want different positions or sizes?  (obviously
> >> controlled by other TCR_EL1 bits)
> > 
> > For the top-byte-ignore (TBI), that's been baked in the architecture
> > since ARMv8.0 and we'll have to keep the backwards compatible mode. As
> > the name implies, it's the top byte of the address and that's what the
> > document above refers to.
> > 
> > With MTE, I can't exclude other configurations in the future but I'd
> > expect the kernel to present the option as a new HWCAP and the user to
> > explicitly opt in via a new prctl() flag. I seriously doubt we'd break
> > existing binaries. So, yes TAG_SHIFT may be different but so would the
> > prctl() above.
> 
> Basically, what you have is a "turn tagging on" and "turn tagging off"
> call which are binary: all on or all off.  How about exposing a mask:
> 
> 	/* Replace hard-coded mask size/position: */
> 	unsigned long mask = prctl(GET_POSSIBLE_TAGGED_ADDR_BITS);
> 
> 	if (mask == 0)
> 		// no tagging is supported obviously
> 
> 	prctl(SET_TAGGED_ADDR_BITS, mask);
> 
> 	// now userspace knows via 'mask' where the tag bits are

For the actual hardware memory tagging, maybe we could get the possible
bits but for TBI, as I said above, that's baked into the architecture. I
don't think it's worth the effort of getting a mask as I don't see ARM
changing this without breaking existing software. Even compiler support
like hwasan relies on the 8-bit TBI.

-- 
Catalin