Re: [PATCH 4/8] arm64: Basic Branch Target Identification support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 24, 2019 at 03:53:06PM +0100, Dave Martin wrote:
> On Fri, May 24, 2019 at 02:02:17PM +0100, Mark Rutland wrote:
> > On Fri, May 24, 2019 at 11:25:29AM +0100, Dave Martin wrote:
> > > +#define arch_calc_vm_prot_bits(prot, pkey) arm64_calc_vm_prot_bits(prot)
> > > +static inline unsigned long arm64_calc_vm_prot_bits(unsigned long prot)
> > > +{
> > > +	if (system_supports_bti() && (prot & PROT_BTI_GUARDED))
> > > +		return VM_ARM64_GP;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +#define arch_vm_get_page_prot(vm_flags) arm64_vm_get_page_prot(vm_flags)
> > > +static inline pgprot_t arm64_vm_get_page_prot(unsigned long vm_flags)
> > > +{
> > > +	return (vm_flags & VM_ARM64_GP) ? __pgprot(PTE_GP) : __pgprot(0);
> > > +}
> > 
> > While the architectural name for the PTE bit is GP, it might make more
> > sense to call the vm flag VM_ARM64_BTI, since people are more likely to
> > recognise BTI than GP as a mnemonic.
> > 
> > Not a big deal either way, though.
> 
> I'm happy to change it.  It's a kernel internal flag used in
> approximately zero places.  So whatever name is most intuitive for
> kernel maintainers is fine.  Nobody else needs to look at it.

Sure thing; I just know that I'm going to remember what BTI is much more
easily than I'll remember what GP is.

> > > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> > > index b2de329..b868ef11 100644
> > > --- a/arch/arm64/include/asm/ptrace.h
> > > +++ b/arch/arm64/include/asm/ptrace.h
> > > @@ -41,6 +41,7 @@
> > >  
> > >  /* Additional SPSR bits not exposed in the UABI */
> > >  #define PSR_IL_BIT		(1 << 20)
> > > +#define PSR_BTYPE_CALL		(2 << 10)
> > 
> > I thought BTYPE was a 2-bit field, so isn't there at leat one other
> > value to have a mnemonic for?
> > 
> > Is it an enumeration or a bitmask?
> 
> It's a 2-bit enumeration, and for now this is the only value that the
> kernel uses: this determines the types of BTI landing pad permitted at
> signal handler entry points in BTI guarded pages.
> 
> Possibly it would be clearer to write it
> 
> #define PSR_BTYPE_CALL		(0b10 << 10)
> 
> but we don't write other ptrace.h constants this way.  In UAPI headers
> we should avoid GCC-isms, but here it's OK since we already rely on this
> syntax internally.
> 
> I can change it if you prefer, though my preference is to leave it.

I have no issue with the (2 << 10) form, but could we add mnemonics for
the other values now, even if we're not using them at this instant?

> > >  #endif /* _UAPI__ASM_HWCAP_H */
> > > diff --git a/arch/arm64/include/uapi/asm/mman.h b/arch/arm64/include/uapi/asm/mman.h
> > > new file mode 100644
> > > index 0000000..4776b43
> > > --- /dev/null
> > > +++ b/arch/arm64/include/uapi/asm/mman.h
> > > @@ -0,0 +1,9 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > > +#ifndef _UAPI__ASM_MMAN_H
> > > +#define _UAPI__ASM_MMAN_H
> > > +
> > > +#include <asm-generic/mman.h>
> > > +
> > > +#define PROT_BTI_GUARDED	0x10		/* BTI guarded page */
> > 
> > From prior discussions, I thought this would be PROT_BTI, without the
> > _GUARDED suffix. Do we really need that?
> > 
> > AFAICT, all other PROT_* definitions only have a single underscore, and
> > the existing arch-specific flags are PROT_ADI on sparc, and PROT_SAO on
> > powerpc.
> 
> No strong opinon.  I was trying to make the name less obscure, but I'm
> equally happy with PROT_BTI if people prefer that.

My personal opinion is that PROT_BTI is preferable, but I'll let others
chime in.

> > > diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> > > index b82e0a9..3717b06 100644
> > > --- a/arch/arm64/kernel/ptrace.c
> > > +++ b/arch/arm64/kernel/ptrace.c
> > > @@ -1860,7 +1860,7 @@ void syscall_trace_exit(struct pt_regs *regs)
> > >   */
> > >  #define SPSR_EL1_AARCH64_RES0_BITS \
> > >  	(GENMASK_ULL(63, 32) | GENMASK_ULL(27, 25) | GENMASK_ULL(23, 22) | \
> > > -	 GENMASK_ULL(20, 13) | GENMASK_ULL(11, 10) | GENMASK_ULL(5, 5))
> > > +	 GENMASK_ULL(20, 13) | GENMASK_ULL(5, 5))
> > >  #define SPSR_EL1_AARCH32_RES0_BITS \
> > >  	(GENMASK_ULL(63, 32) | GENMASK_ULL(22, 22) | GENMASK_ULL(20, 20))
> > 
> > Phew; I was worried this would be missed!
> 
> It was.  I had fun debugging that one :)
> 
> > > @@ -741,6 +741,11 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
> > >  	regs->regs[29] = (unsigned long)&user->next_frame->fp;
> > >  	regs->pc = (unsigned long)ka->sa.sa_handler;
> > >  
> > > +	if (system_supports_bti()) {
> > > +		regs->pstate &= ~(regs->pstate & PSR_BTYPE_MASK);
> > 
> > Nit: that can be:
> > 
> > 		regs->pstate &= ~PSR_BTYPE_MASK;
> 
> x & ~y is sensitive to the type of y and can clobber high bits, so I
> prefer not to write it.  GCC generates the same code either way.

Ah, I thought this might befor type promotion.

> However, this will also trip us up elsewhere when the time comes, so
> maybe it's a waste of time working around it here.
> 
> If you feel strongly, I'm happy to change it.

I'd rather we followed the same pattern as elsewhere, as having this
special case is confusing, and we'd still have the same bug elsewhere.

My concern here is consistency, so if you want to fix up all instances
to preserve the upper 32 bits of regs->pstate, I'd be happy. :)

I also think there are nicer/clearer ways to fix the type promotion
issue, like using UL in the field definitions, using explicit casts, or
adding helpers to set/clear bits with appropriate promotion.

> > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > index 5610ac0..85b456b 100644
> > > --- a/arch/arm64/kernel/syscall.c
> > > +++ b/arch/arm64/kernel/syscall.c
> > > @@ -66,6 +66,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > >  	unsigned long flags = current_thread_info()->flags;
> > >  
> > >  	regs->orig_x0 = regs->regs[0];
> > > +	regs->pstate &= ~(regs->pstate & PSR_BTYPE_MASK);
> > 
> > Likewise:
> > 
> > 	regs->pstate &= ~PSR_BTYPE_MASK;
> > 
> > ... though I don't understand why that would matter to syscalls, nor how
> > those bits could ever be set given we had to execute an SVC to get here.
> > 
> > What am I missing?
> 
> The behaviour is counterintuivite here.  The architecture guarantees to
> preserve BTYPE for traps, faults and asynchronous exceptions, but for a
> synchronous execption from normal architectural execution of an
> exception-generating instruction (SVC/HVC/SMC) the architecture leaves
> it IMP DEF whether BTYPE is preserved or zeroed in SPSR.

I'm still missing something here. IIUC were BTYPE was non-zero, we
should take the BTI trap before executing the SVC/HVC/SMC, right?

Otherwise, it would be possible to erroneously branch to an SVC/HVC/SMC,
which would logically violate the BTI protection.

If the assumption is that software can fix that case up, and the ??C
exception is prioritized above the BTI exception, then I think that we
should check whether it was permitted rather than silently fixing it up.

> I suppose precisely because there's only one way to reach the SVC
> handler, software knows for certain whether zero SPSR.BTYPE in that
> case.  So hardware doesn't need to do it.

As above, I thought BTYPE had to be zero in order for it to be possible
to execute the SVC/HVC/SMC, but there might be caveats.

Thanks,
Mark.



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux