On Tue, 17 Jan 2023 15:55:53 +0000 Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > On Sat, Jan 14, 2023, Zhi Wang wrote: > > On Fri, 13 Jan 2023 15:16:08 +0000 > Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > > On Fri, Jan 13, 2023, Zhi Wang wrote: > > > > Better add a FIXME: here as this has to be fixed later. > > > > > > No, leaking the page is all KVM can reasonably do here. An improved > > > comment would be helpful, but no code change is required. > > > tdx_reclaim_page() returns an error if and only if there's an > > > unexpected, fatal error, e.g. a SEAMCALL with bad params, incorrect > > > concurrency in KVM, a TDX Module bug, etc. Retrying at a later point is > > > highly unlikely to be successful. > > > > Hi: > > > > The word "leaking" sounds like a situation left unhandled temporarily. > > > > I checked the source code of the TDX module[1] for the possible reason to > > fail when reviewing this patch: > > > > tdx-module-v1.0.01.01.zip\src\vmm_dispatcher\api_calls\tdh_phymem_page_reclaim.c > > tdx-module-v1.0.01.01.zip\src\vmm_dispatcher\api_calls\tdh_phymem_page_wbinvd.c > > > > a. Invalid parameters. For example, page is not aligned, PA HKID is not zero... > > > > For invalid parameters, a WARN_ON_ONCE() + return value is good enough as > > that is how kernel handles similar situations. The caller takes the > > responsibility. > > > > b. Locks has been taken in TDX module. TDR page has been locked due to another > > SEAMCALL, another SEAMCALL is doing PAMT walk and holding PAMT lock... > > > > This needs to be improved later either by retry or taking tdx_lock to avoid > > TDX module fails on this. > > No, tdx_reclaim_page() already retries TDH.PHYMEM.PAGE.RECLAIM if the target page > is contended (though I'd question the validity of even that), and TDH.PHYMEM.PAGE.WBINVD > is performed only when reclaiming the TDR. If there's contention when reclaiming > the TDR, then KVM effectively has a use-after-free bug, i.e. leaking the page is > the least of our worries. > Hi: Thanks for the reply. "Leaking" is the consquence of even failing in retry. I agree with this. But I was questioning if "retry" is really a correct and only solution when encountering lock contention in the TDX module as I saw that there are quite some magic numbers are going to be introduced because of "retry" and there were discussions about times of retry should be 3 or 1000 in TDX guest on hyper-V patches. It doesn't sound right. Compare to an typical *kernel lock* case, an execution path can wait on a waitqueue and later will be woken up. We usually do contention-wait-and-retry and we rarely just do contention and retry X times. In TDX case, I understand that it is hard for the TDX module to provide similar solutions as an execution path can't stay long in the TDX module. 1) We can always take tdx_lock (linux kernel lock) when calling a SEAMCALL that touch the TDX internal locks. But the downside is we might lose some concurrency. 2) As TDX module doesn't provide contention-and-wait, I guess the following approach might have been discussed when designing this "retry". KERNEL TDX MODULE SEAMCALL A -> PATH A: Taking locks SEAMCALL B -> PATH B: Contention on a lock <- Return "operand busy" SEAMCALL B -| | <- Wait on a kernel waitqueue SEAMCALL B <-| SEAMCALL A <- PATH A: Return SEAMCALL A -| | <- Wake up the waitqueue SEMACALL A <-| SEAMCALL B -> PATH B: Taking the locks ... Why not this scheme wasn't chosen? > > On Thu, Jan 12, 2023 at 8:34 AM <isaku.yamahata@xxxxxxxxx> wrote: > > +static int tdx_reclaim_page(hpa_t pa, bool do_wb, u16 hkid) > > +{ > > + struct tdx_module_output out; > > + u64 err; > > + > > + do { > > + err = tdh_phymem_page_reclaim(pa, &out); > > + /* > > + * TDH.PHYMEM.PAGE.RECLAIM is allowed only when TD is shutdown. > > + * state. i.e. destructing TD. > > + * TDH.PHYMEM.PAGE.RECLAIM requires TDR and target page. > > + * Because we're destructing TD, it's rare to contend with TDR. > > + */ > > + } while (err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX)); > > + if (WARN_ON_ONCE(err)) { > > + pr_tdx_error(TDH_PHYMEM_PAGE_RECLAIM, err, &out); > > + return -EIO; > > + } > > + > > + if (do_wb) { > > + /* > > + * Only TDR page gets into this path. No contention is expected > > + * because of the last page of TD. > > + */ > > + err = tdh_phymem_page_wbinvd(set_hkid_to_hpa(pa, hkid)); > > + if (WARN_ON_ONCE(err)) { > > + pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL); > > + return -EIO; > > + } > > + } > > + > > + tdx_clear_page(pa); > > + return 0; > > +}