Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

Khalid Aziz <khalid.aziz@xxxxxxxxxx> · Thu, 4 Apr 2019 09:15:46 -0600

On 4/4/19 1:43 AM, Peter Zijlstra wrote:
> 
> You must be so glad I no longer use kmap_atomic from NMI context :-)
> 
> On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
>> +static inline void xpfo_kmap(void *kaddr, struct page *page)
>> +{
>> +	unsigned long flags;
>> +
>> +	if (!static_branch_unlikely(&xpfo_inited))
>> +		return;
>> +
>> +	if (!PageXpfoUser(page))
>> +		return;
>> +
>> +	/*
>> +	 * The page was previously allocated to user space, so
>> +	 * map it back into the kernel if needed. No TLB flush required.
>> +	 */
>> +	spin_lock_irqsave(&page->xpfo_lock, flags);
>> +
>> +	if ((atomic_inc_return(&page->xpfo_mapcount) == 1) &&
>> +		TestClearPageXpfoUnmapped(page))
>> +		set_kpte(kaddr, page, PAGE_KERNEL);
>> +
>> +	spin_unlock_irqrestore(&page->xpfo_lock, flags);
> 
> That's a really sad sequence, not wrong, but sad. _3_ atomic operations,
> 2 on likely the same cacheline. And mostly all pointless.
> 
> This patch makes xpfo_mapcount an atomic, but then all modifications are
> under the spinlock, what gives?
> 
> Anyway, a possibly saner sequence might be:
> 
> 	if (atomic_inc_not_zero(&page->xpfo_mapcount))
> 		return;
> 
> 	spin_lock_irqsave(&page->xpfo_lock, flag);
> 	if ((atomic_inc_return(&page->xpfo_mapcount) == 1) &&
> 	    TestClearPageXpfoUnmapped(page))
> 		set_kpte(kaddr, page, PAGE_KERNEL);
> 	spin_unlock_irqrestore(&page->xpfo_lock, flags);
> 
>> +}
>> +
>> +static inline void xpfo_kunmap(void *kaddr, struct page *page)
>> +{
>> +	unsigned long flags;
>> +
>> +	if (!static_branch_unlikely(&xpfo_inited))
>> +		return;
>> +
>> +	if (!PageXpfoUser(page))
>> +		return;
>> +
>> +	/*
>> +	 * The page is to be allocated back to user space, so unmap it from
>> +	 * the kernel, flush the TLB and tag it as a user page.
>> +	 */
>> +	spin_lock_irqsave(&page->xpfo_lock, flags);
>> +
>> +	if (atomic_dec_return(&page->xpfo_mapcount) == 0) {
>> +#ifdef CONFIG_XPFO_DEBUG
>> +		WARN_ON(PageXpfoUnmapped(page));
>> +#endif
>> +		SetPageXpfoUnmapped(page);
>> +		set_kpte(kaddr, page, __pgprot(0));
>> +		xpfo_flush_kernel_tlb(page, 0);
> 
> You didn't speak about the TLB invalidation anywhere. But basically this
> is one that x86 cannot do.
> 
>> +	}
>> +
>> +	spin_unlock_irqrestore(&page->xpfo_lock, flags);
> 
> Idem:
> 
> 	if (atomic_add_unless(&page->xpfo_mapcount, -1, 1))
> 		return;
> 
> 	....
> 
> 
>> +}
> 
> Also I'm failing to see the point of PG_xpfo_unmapped, afaict it
> is identical to !atomic_read(&page->xpfo_mapcount).
> 

Thanks Peter. I really appreciate your review. Your feedback helps make
this code better and closer to where I can feel comfortable not calling
it RFC any more.

The more I look at xpfo_kmap()/xpfo_kunmap() code, the more I get
uncomfortable with it. As you pointed out about calling kmap_atomic from
NMI context, that just makes the kmap_atomic code look even worse. I
pointed out one problem with this code in cover letter and suggested a
rewrite. I see these problems with this code:

1. When xpfo_kmap maps a page back in physmap, it opens up the ret2dir
attack security hole again even if just for the duration of kmap. A kmap
can stay around for some time if the page is being used for I/O.

2. This code uses spinlock which leads to problems. If it does not
disable IRQ, it is exposed to deadlock around xpfo_lock. If it disables
IRQ, I think it can still deadlock around pgd_lock.

I think a better implementation of xpfo_kmap()/xpfo_kunmap() would map
the page at a new virtual address similar to what kmap_high for i386
does. This avoids re-opening the ret2dir security hole. We can also
possibly do away with xpfo_lock saving bytes in page-frame and the not
so sane code sequence can go away.

Good point about PG_xpfo_unmapped flag. You are right that it can be
replaced with !atomic_read(&page->xpfo_mapcount).

Thanks,
Khalid