On Mon, Jul 12, 2021, Dave Hansen wrote: > On 7/12/21 11:44 AM, Peter Gonda wrote: > >> +int psmash(struct page *page) > >> +{ > >> + unsigned long spa = page_to_pfn(page) << PAGE_SHIFT; > >> + int ret; > >> + > >> + if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP)) > >> + return -ENXIO; > >> + > >> + /* Retry if another processor is modifying the RMP entry. */ > >> + do { > >> + /* Binutils version 2.36 supports the PSMASH mnemonic. */ > >> + asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF" > >> + : "=a"(ret) > >> + : "a"(spa) > >> + : "memory", "cc"); > >> + } while (ret == FAIL_INUSE); > > Should there be some retry limit here for safety? Or do we know that > > we'll never be stuck in this loop? Ditto for the loop in rmpupdate. > > It's probably fine to just leave this. While you could *theoretically* > lose this race forever, it's unlikely to happen in practice. If it > does, you'll get an easy-to-understand softlockup backtrace which should > point here pretty quickly. But should failure here even be tolerated? The TDX cases spin on flows that are _not_ due to (direct) contenion, e.g. a pending interrupt while flushing the cache or lack of randomness when generating a key. In this case, there are two CPUs racing to modify the RMP entry, which implies that the final state of the RMP entry is not deterministic. > I think TDX has a few of these as well. Most of the "SEAMCALL"s from > host to the firmware doing the security enforcement have something like > an -EBUSY as well. I believe they just retry forever too.