Re: [PATCHv4 6/8] x86/mm: Provide helpers for unaccepted memory

"Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> · Wed, 13 Apr 2022 19:08:39 +0300

On Fri, Apr 08, 2022 at 12:21:19PM -0700, Dave Hansen wrote:
> On 4/5/22 16:43, Kirill A. Shutemov wrote:
> > +void accept_memory(phys_addr_t start, phys_addr_t end)
> > +{
> > +	unsigned long *unaccepted_memory;
> > +	unsigned long flags;
> > +	unsigned int rs, re;
> > +
> > +	if (!boot_params.unaccepted_memory)
> > +		return;
> > +
> > +	unaccepted_memory = __va(boot_params.unaccepted_memory);
> > +	rs = start / PMD_SIZE;
> > +
> > +	spin_lock_irqsave(&unaccepted_memory_lock, flags);
> > +	for_each_set_bitrange_from(rs, re, unaccepted_memory,
> > +				   DIV_ROUND_UP(end, PMD_SIZE)) {
> > +		/* Platform-specific memory-acceptance call goes here */
> > +		panic("Cannot accept memory");
> > +		bitmap_clear(unaccepted_memory, rs, re - rs);
> > +	}
> > +	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
> > +}
> 
> Just to reiterate: this is a global spinlock.  It's disabling
> interrupts.  "Platform-specific memory-acceptance call" will soon become:
> 
> 	tdx_accept_memory(rs * PMD_SIZE, re * PMD_SIZE);
> 
> which is a page-by-page __tdx_module_call():
> 
> > +	for (i = 0; i < (end - start) / PAGE_SIZE; i++) {
> > +		if (__tdx_module_call(TDACCEPTPAGE, start + i * PAGE_SIZE,
> > +				      0, 0, 0, NULL)) {
> > +			error("Cannot accept memory: page accept failed\n");
> > +		}
> > +	}
> 
> Each __tdx_module_call() involves a privilege transition that also
> surely includes things like changing CR3.  It can't be cheap.  It also
> is presumably touching the memory and probably flushing it out of the
> CPU caches.  It's also unbounded:
> 
> 	spin_lock_irqsave(&unaccepted_memory_lock, flags);
> 	for (i = 0; i < (end - start) / PAGE_SIZE; i++)
> 		// thousands?  tens-of-thousands of cycles??
> 	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
> 
> How far apart can end and start be?  It's at *least* 2MB in the page
> allocator, which is on the order of a millisecond.  Are we sure there
> aren't any callers that want to do this at a gigabyte granularity?  That
> would hold the global lock and disable interrupts on the order of a second.

This code path is only invoked for orders below MAX_ORDER, which means at
most 4M on x86-64.
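
To make the 4M figure concrete, here is a rough sketch of the arithmetic. It
assumes 4K base pages and MAX_ORDER == 11 (so the largest allocator order is
10); the SKETCH_* names and both helpers are made up for illustration and are
not part of the patch.

```c
#include <assert.h>

/* Assumed x86-64 values: 4K base pages, allocator orders 0..MAX_ORDER-1. */
#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_MAX_ORDER  11UL

/* Bytes covered by the largest page-allocator request (order MAX_ORDER-1). */
static unsigned long max_alloc_bytes(void)
{
	return 1UL << (SKETCH_MAX_ORDER - 1 + SKETCH_PAGE_SHIFT);
}

/* Number of page-by-page accept calls needed to cover 'bytes'. */
static unsigned long accept_calls(unsigned long bytes)
{
	/* Only whole pages can be accepted. */
	assert(bytes % (1UL << SKETCH_PAGE_SHIFT) == 0);
	return bytes >> SKETCH_PAGE_SHIFT;
}
```

Under these assumptions max_alloc_bytes() is 4M and accept_calls() on it is
1024, so the worst case from the page allocator is 1024 page accepts under
the lock, i.e. about twice the 2MB case you estimated above.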

> Do we want to bound the time that the lock can be held?  Or, should we
> just let the lockup detectors tell us that we're being naughty?

The host can always DoS the guest, so yes, this can lead to lockups.
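
For reference, if we did want to bound the hold time, one option would be to
take the lock once per 2M unit rather than once per range. Below is a minimal
userspace model of that idea, not kernel code and not what the patch does:
the bool array stands in for the bitmap, model_lock()/model_unlock() stand in
for spin_lock_irqsave()/spin_unlock_irqrestore(), and every name in it is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define UNIT_COUNT 64			/* hypothetical number of 2M units */

static bool unaccepted[UNIT_COUNT];	/* stand-in for the unaccepted bitmap */
static int lock_depth;			/* stand-in for the global spinlock */

static void model_lock(void)   { assert(lock_depth == 0); lock_depth = 1; }
static void model_unlock(void) { assert(lock_depth == 1); lock_depth = 0; }

/*
 * Accept units [first, last), holding the lock for one unit at a time.
 * Returns how many units were actually accepted.
 */
static unsigned long accept_units_bounded(size_t first, size_t last)
{
	unsigned long accepted = 0;

	for (size_t i = first; i < last && i < UNIT_COUNT; i++) {
		model_lock();
		if (unaccepted[i]) {
			/* Platform-specific accept call would go here. */
			unaccepted[i] = false;
			accepted++;
		}
		model_unlock();
		/* Lock is dropped here, so pending interrupts could run. */
	}
	return accepted;
}
```

This caps each critical section at one unit's worth of accept calls, at the
cost of more lock round trips. It does nothing about a host that stalls an
individual accept call.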

-- 
 Kirill A. Shutemov