Hi,

Thanks for reviewing, I think these 3 or 4 patches are the trickiest in
the whole SVA series.

On Mon, Jul 13, 2020 at 04:46:23PM +0100, Will Deacon wrote:
> On Thu, Jun 18, 2020 at 05:51:17PM +0200, Jean-Philippe Brucker wrote:
> > diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> > index 68140fdd89d6b..bbdd291e31d59 100644
> > --- a/arch/arm64/include/asm/mmu.h
> > +++ b/arch/arm64/include/asm/mmu.h
> > @@ -19,6 +19,7 @@
> >
> >  typedef struct {
> >  	atomic64_t	id;
> > +	unsigned long	pinned;
>
> bool?

It's a refcount. I'll change it to refcount_t because it looks nicer
overall, even though it doesn't need to be atomic.

> >  static void set_kpti_asid_bits(void)
> >  {
> > +	unsigned int k;
> > +	u8 *dst = (u8 *)asid_map;
> > +	u8 *src = (u8 *)pinned_asid_map;
> >  	unsigned int len = BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(unsigned long);
> >  	/*
> >  	 * In case of KPTI kernel/user ASIDs are allocated in
> > @@ -81,7 +88,8 @@ static void set_kpti_asid_bits(void)
> >  	 * is set, then the ASID will map only userspace. Thus
> >  	 * mark even as reserved for kernel.
> >  	 */
> > -	memset(asid_map, 0xaa, len);
> > +	for (k = 0; k < len; k++)
> > +		dst[k] = src[k] | 0xaa;
>
> Can you use __bitmap_replace() here? I think it would be clearer to use the
> bitmap API wherever possible, since casting 'unsigned long *' to 'u8 *'
> just makes me worry about endianness issues (although in this case I don't
> think it's a problem).

I can do better actually: initialize pinned_asid_map with the kernel ASID
bits at boot and just copy the bitmap on rollover.

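Roughly like this on the rollover path -- untested sketch, helper names
are just what I have in mind for the next version, not final. Here
set_kpti_asid_bits() takes the bitmap as a parameter, and asids_init()
would call it once on pinned_asid_map when CONFIG_UNMAP_KERNEL_AT_EL0 is
enabled:

	static void set_kpti_asid_bits(unsigned long *map)
	{
		unsigned int len = BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(unsigned long);

		/*
		 * In case of KPTI kernel/user ASIDs are allocated in
		 * pairs, the bottom bit distinguishes the two: if it
		 * is set, then the ASID will map only userspace. Thus
		 * mark even as reserved for kernel.
		 */
		memset(map, 0xaa, len);
	}

	/* Rollover: rebuild asid_map from the pinned ASIDs */
	static void set_reserved_asid_bits(void)
	{
		if (pinned_asid_map)
			bitmap_copy(asid_map, pinned_asid_map, NUM_USER_ASIDS);
		else if (arm64_kernel_unmapped_at_el0())
			set_kpti_asid_bits(asid_map);
		else
			bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
	}

Since the kernel ASID bits are already set in pinned_asid_map at boot, the
rollover path becomes a plain bitmap_copy() and the u8 casts go away.
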
> > +unsigned long mm_context_get(struct mm_struct *mm)
> > +{
> > +	unsigned long flags;
> > +	u64 asid;
> > +
> > +	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
> > +
> > +	asid = atomic64_read(&mm->context.id);
> > +
> > +	if (mm->context.pinned) {
> > +		mm->context.pinned++;
> > +		asid &= ~ASID_MASK;
> > +		goto out_unlock;
> > +	}
> > +
> > +	if (nr_pinned_asids >= max_pinned_asids) {
> > +		asid = 0;
> > +		goto out_unlock;
> > +	}
> > +
> > +	if (!asid_gen_match(asid)) {
> > +		/*
> > +		 * We went through one or more rollover since that ASID was
> > +		 * used. Ensure that it is still valid, or generate a new one.
> > +		 */
> > +		asid = new_context(mm);
> > +		atomic64_set(&mm->context.id, asid);
> > +	}
> > +
> > +	asid &= ~ASID_MASK;
> > +
> > +	nr_pinned_asids++;
> > +	__set_bit(asid2idx(asid), pinned_asid_map);
> > +	mm->context.pinned++;
> > +
> > +out_unlock:
> > +	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
>
> Maybe stick the & ~ASID_MASK here so it's easier to read?
>
> > +	/* Set the equivalent of USER_ASID_BIT */
> > +	if (asid && IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0))
> > +		asid |= 1;
> > +
> > +	return asid;
> > +}
> > +EXPORT_SYMBOL_GPL(mm_context_get);
>
> That's quite a generic symbol name to export... maybe throw 'arm64_' in
> front of it?
>
> > +
> > +void mm_context_put(struct mm_struct *mm)
> > +{
> > +	unsigned long flags;
> > +	u64 asid = atomic64_read(&mm->context.id) & ~ASID_MASK;
>
> I don't think you need the masking here.

Right, asid2idx() takes care of it.

> > +	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
> > +
> > +	if (--mm->context.pinned == 0) {
> > +		__clear_bit(asid2idx(asid), pinned_asid_map);
> > +		nr_pinned_asids--;
> > +	}
> > +
> > +	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
> > +}
> > +EXPORT_SYMBOL_GPL(mm_context_put);
>
> Same naming comment here.
>
> >  /* Errata workaround post TTBRx_EL1 update. */
> >  asmlinkage void post_ttbr_update_workaround(void)
> >  {
> > @@ -303,6 +381,13 @@ static int asids_update_limit(void)
> >  	WARN_ON(num_available_asids - 1 <= num_possible_cpus());
> >  	pr_info("ASID allocator initialised with %lu entries\n",
> >  		num_available_asids);
> > +
> > +	/*
> > +	 * We assume that an ASID is always available after a rollover. This
> > +	 * means that even if all CPUs have a reserved ASID, there still is at
> > +	 * least one slot available in the asid map.
> > +	 */
> > +	max_pinned_asids = num_available_asids - num_possible_cpus() - 2;
>
> Is it worth checking that assumption, rather than setting max_pinned_asids
> to a massive value?

The comment isn't right, it should be something like:

"There must always be an ASID available after a rollover. Ensure that,
even if all CPUs have a reserved ASID and the maximum number of ASIDs are
pinned, there still is at least one empty slot in the ASID map."

I do wonder if we should reduce max_pinned_asids, because the system will
probably become unusable if it gets close to a single available ASID. But
I think we'll need to introduce higher-level controls for SVA such as
cgroups to prevent users from pinning too many ASIDs.

>
> >  	return 0;
> >  }
> >  arch_initcall(asids_update_limit);
> > @@ -317,6 +402,12 @@ static int asids_init(void)
> >  		panic("Failed to allocate bitmap for %lu ASIDs\n",
> >  		      NUM_USER_ASIDS);
> >
> > +	pinned_asid_map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS),
> > +				  sizeof(*pinned_asid_map), GFP_KERNEL);
> > +	if (!pinned_asid_map)
> > +		panic("Failed to allocate pinned ASID bitmap\n");
>
> Why can't we continue in this case without support for pinned ASIDs?

Good idea, I'll change that

Thanks,
Jean
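
P.S. For that last change, what I have in mind is roughly the following
(untested sketch, not the final patch): warn and keep going if the
allocation fails, and have mm_context_get() fail gracefully when pinning
isn't available.

	/* asids_init(): don't panic, just lose the ability to pin ASIDs */
	pinned_asid_map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS),
				  sizeof(*pinned_asid_map), GFP_KERNEL);
	if (!pinned_asid_map)
		pr_warn("Failed to allocate pinned ASID bitmap\n");

	/* mm_context_get(): bail out early, 0 already means "no pinned ASID" */
	if (!pinned_asid_map)
		return 0;

	raw_spin_lock_irqsave(&cpu_asid_lock, flags);

The rollover helper sketched above would then fall back to
set_kpti_asid_bits()/bitmap_clear() when pinned_asid_map is NULL.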