Re: [RFC PATCH 3/6] mm, arm64: untag user addresses in memory syscalls

Evgenii Stepanov <eugenis@xxxxxxxxxx> · Thu, 15 Mar 2018 18:11:46 -0700

On Wed, Mar 14, 2018 at 10:44 AM, Catalin Marinas
<catalin.marinas@xxxxxxx> wrote:
> On Wed, Mar 14, 2018 at 04:45:20PM +0100, Andrey Konovalov wrote:
>> On Fri, Mar 9, 2018 at 6:42 PM, Evgenii Stepanov <eugenis@xxxxxxxxxx> wrote:
>> > On Fri, Mar 9, 2018 at 9:31 AM, Andrey Konovalov <andreyknvl@xxxxxxxxxx> wrote:
>> >> On Fri, Mar 9, 2018 at 4:53 PM, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>> >>> I'm not yet convinced these functions need to allow tagged pointers.
>> >>> They are not doing memory accesses but rather dealing with the memory
>> >>> range, hence an untagged pointer is better suited. There is probably a
>> >>> reason why the "start" argument is "unsigned long" vs "void __user *"
>> >>> (in the kernel, not the man page).
>> >>
>> >> So that would require the user to untag pointers before passing them to these syscalls.
>> >>
>> >> Evgeniy, would it be possible to untag pointers in HWASan before
>> >> using memory subsystem syscalls? Is there a reason for untagging them
>> >> in the kernel?
>> >
>> > Generally, no. It's possible to intercept a libc call using symbol
>> > interposition, but I don't know how to rewrite arguments of a raw
>> > system call other than through ptrace, and that creates more problems
>> > than it solves.
>
> With these patches, we are trying to relax the user/kernel ABI so that
> tagged pointers can be passed into the kernel. Since this is a new ABI
> (or an extension to the existing one), it might be ok to change the libc
> so that the top byte is zeroed on specific syscalls before issuing the
> SVC.
>
> I agree that it is problematic for HWASan if it only relies on
> overriding malloc/free.
>
>> > AFAIU, it's valid for a program to pass an address obtained from
>> > malloc or, better, posix_memalign to an mm syscall like mprotect().
>> > These arguments are pointers from the userspace point of view.
>>
>> Catalin, do you think this is a good reason to have the untagging done
>> in the kernel?
>
> malloc() or posix_memalign() are C library implementations and it's the
> C library (or overridden functions) setting a tag on the returned
> pointers. Since the TBI hardware feature allows memory accesses with a
> non-zero tag, we could allow them in the kernel for syscalls performing
> such accesses on behalf of the user (e.g. get_user/put_user would not
> need to clear the tag).
>
> madvise(), OTOH, does not perform a memory access on behalf of the user,
> it's just advising the kernel about a range of virtual addresses. That's
> where I think, from an ABI perspective, it doesn't make much sense to
> allow tags into the kernel for these syscalls (even if it's simpler from
> a user space perspective).
>
> (but I don't have a very strong opinion on this ;))

I don't have a strong opinion on this, either.
Ideally, I would like tags to be fully transparent for user space
code. Using MM syscalls on a malloc/memalign address is not a very
common pattern, so it might be OK to not allow tags there. But then all
such code would have to be changed with explicit knowledge of TBI.
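
To illustrate what "explicit knowledge of TBI" could look like in user
code, here is a minimal sketch. It assumes the tag lives in the top byte
(bits 63:56) of the address, as with arm64 TBI. The helper names
(with_tag, untag_addr, mprotect_untagged, demo_mprotect_on_heap_page) are
hypothetical and not taken from HWASan, libc, or the patch series. The
tagged value is only carried around as an integer and never dereferenced,
so the sketch also behaves on machines without TBI.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: the tag occupies the top byte (bits 63:56), as with arm64 TBI. */
#define TAG_SHIFT 56
#define TAG_MASK  ((uint64_t)0xff << TAG_SHIFT)

/* Hypothetical helper: place `tag` in the top byte of `addr`. */
static uint64_t with_tag(uint64_t addr, uint8_t tag)
{
    return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Hypothetical helper: clear the top byte, yielding the plain address. */
static uint64_t untag_addr(uint64_t addr)
{
    return addr & ~TAG_MASK;
}

/* Hypothetical wrapper: strip the tag in user space before the syscall,
 * which is what callers would need if the kernel rejects tagged ranges. */
static int mprotect_untagged(void *p, size_t len, int prot)
{
    void *plain = (void *)(uintptr_t)untag_addr((uint64_t)(uintptr_t)p);
    return mprotect(plain, len, prot);
}

/* Returns 0 if a page obtained from posix_memalign() can be protected
 * through a (simulated) tagged pointer once the tag is stripped. */
int demo_mprotect_on_heap_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = NULL;

    if (page <= 0 || posix_memalign(&p, (size_t)page, (size_t)page) != 0)
        return -1;

    /* Simulate a tagging allocator; the tagged value is never dereferenced. */
    void *tagged = (void *)(uintptr_t)with_tag((uint64_t)(uintptr_t)p, 0x2a);

    int rc = mprotect_untagged(tagged, (size_t)page, PROT_READ);
    if (rc == 0)  /* restore write access before handing the page back */
        rc = mprotect_untagged(tagged, (size_t)page, PROT_READ | PROT_WRITE);

    free(p);
    return rc;
}
```

Every call site that passes a heap address to an mm syscall would need a
wrapper of this kind, which is the cost of not untagging in the kernel.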