On 4/5/22 16:43, Kirill A. Shutemov wrote:
> +void accept_memory(phys_addr_t start, phys_addr_t end)
> +{
> +	unsigned long *unaccepted_memory;
> +	unsigned long flags;
> +	unsigned int rs, re;
> +
> +	if (!boot_params.unaccepted_memory)
> +		return;
> +
> +	unaccepted_memory = __va(boot_params.unaccepted_memory);
> +	rs = start / PMD_SIZE;
> +
> +	spin_lock_irqsave(&unaccepted_memory_lock, flags);
> +	for_each_set_bitrange_from(rs, re, unaccepted_memory,
> +				   DIV_ROUND_UP(end, PMD_SIZE)) {
> +		/* Platform-specific memory-acceptance call goes here */
> +		panic("Cannot accept memory");
> +		bitmap_clear(unaccepted_memory, rs, re - rs);
> +	}
> +	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
> +}

Just to reiterate: this is a global spinlock.  It's disabling
interrupts.  "Platform-specific memory-acceptance call" will soon become:

	tdx_accept_memory(rs * PMD_SIZE, re * PMD_SIZE);

which is a page-by-page __tdx_module_call():

> +	for (i = 0; i < (end - start) / PAGE_SIZE; i++) {
> +		if (__tdx_module_call(TDACCEPTPAGE, start + i * PAGE_SIZE,
> +				      0, 0, 0, NULL)) {
> +			error("Cannot accept memory: page accept failed\n");
> +		}
> +	}

Each __tdx_module_call() involves a privilege transition that also
surely includes things like changing CR3.  It can't be cheap.  It is
also presumably touching the memory and probably flushing it out of the
CPU caches.  It's also unbounded:

	spin_lock_irqsave(&unaccepted_memory_lock, flags);
	for (i = 0; i < (end - start) / PAGE_SIZE; i++)
		// thousands? tens-of-thousands of cycles??
	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);

How far apart can end and start be?  It's at *least* 2MB in the page
allocator, which is on the order of a millisecond.  Are we sure there
aren't any callers that want to do this at a gigabyte granularity?  That
would hold the global lock and disable interrupts on the order of a
second.

Do we want to bound the time that the lock can be held?  Or should we
just let the lockup detectors tell us that we're being naughty?
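
To make the "bound the lock hold time" option concrete, here is a rough
userspace model, not a proposed patch.  Everything in it is an
assumption for illustration: accept_memory_bounded(), struct
accept_stats and MAX_PAGES_PER_HOLD are made-up names, the lock and the
per-page accept call are replaced by counters, and the bitmap handling
is left out entirely.  It only shows the shape of accepting a range in
chunks of at most one PMD per lock hold:

```c
#include <stdint.h>

#define PAGE_SIZE		4096UL
#define PMD_SIZE		(2UL << 20)	/* 2MB, as on x86-64 */
/* Hypothetical bound: accept at most one PMD worth of pages per lock hold. */
#define MAX_PAGES_PER_HOLD	(PMD_SIZE / PAGE_SIZE)

/* Counters standing in for the real spinlock and the per-page accept call. */
struct accept_stats {
	unsigned long lock_holds;	/* number of lock/unlock pairs */
	unsigned long pages_accepted;	/* total simulated page accepts */
	unsigned long max_pages_held;	/* most pages accepted in one hold */
};

/*
 * Accept [start, end) in bounded chunks.  The "lock" is taken and
 * released once per chunk, so the longest hold covers at most
 * MAX_PAGES_PER_HOLD page accepts no matter how large the range is.
 */
static void accept_memory_bounded(uint64_t start, uint64_t end,
				  struct accept_stats *st)
{
	while (start < end) {
		uint64_t chunk_end = start + MAX_PAGES_PER_HOLD * PAGE_SIZE;
		unsigned long held = 0;
		uint64_t addr;

		if (chunk_end > end)
			chunk_end = end;

		st->lock_holds++;	/* spin_lock_irqsave() would go here */
		for (addr = start; addr < chunk_end; addr += PAGE_SIZE) {
			st->pages_accepted++;	/* per-page accept would go here */
			held++;
		}
		if (held > st->max_pages_held)
			st->max_pages_held = held;
		/* spin_unlock_irqrestore() would go here; IRQs can run again */

		start = chunk_end;
	}
}
```

For a 1GB range this comes out to 262144 page accepts spread over 512
lock holds of 512 pages each, instead of one hold covering all of them.
The catch, which the model ignores, is that once the lock is dropped
mid-range another CPU can race to accept the same memory, so the real
code would need the bitmap update and the accept to stay consistent per
chunk.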