On Tue, 2023-01-10 at 08:53 -0800, Dave Hansen wrote:
> On 1/10/23 02:15, Huang, Kai wrote:
> > On Fri, 2023-01-06 at 14:49 -0800, Dave Hansen wrote:
> > > On 12/8/22 22:52, Kai Huang wrote:
> ...
> > > > + * Note:
> > > > + *
> > > > + * This function neither checks whether there's at least one online cpu
> > > > + * for each package, nor explicitly prevents any cpu from going offline.
> > > > + * If any package doesn't have any online cpu then the SEAMCALL won't be
> > > > + * done on that package and the later step of TDX module initialization
> > > > + * will fail.  The caller needs to guarantee this.
> > > > + */
> > >
> > > *Does* the caller guarantee it?
> > >
> > > You're basically saying, "this code needs $FOO to work", but you're not
> > > saying who *provides* $FOO.
> >
> > In short, KVM can do something to guarantee but won't 100% guarantee this.
> >
> > Specifically, KVM won't actively try to bring up cpu to guarantee this if
> > there's any package has no online cpu at all (see the first lore link below).
> > But KVM can _check_ whether this condition has been met before calling
> > tdx_init() and speak out if not.  At the meantime, if the condition is met,
> > refuse to offline the last cpu for each package (or any cpu) during module
> > initialization.
> >
> > And KVM needs similar handling anyway.  The reason is not only configuring the
> > global KeyID has such requirement, creating/destroying TD (which involves
> > programming/reclaiming one TDX KeyID) also require at least one online cpu for
> > each package.
> >
> > There were discussions around this on KVM how to handle.  IIUC the solution is
> > KVM will:
> > 1) fail to create TD if any package has no online cpu.
> > 2) refuse to offline the last cpu for each package when there's any _active_ TDX
> > guest running.
> >
> > https://lore.kernel.org/lkml/20221102231911.3107438-1-seanjc@xxxxxxxxxx/T/#m1ff338686cfcb7ba691cd969acc17b32ff194073
> > https://lore.kernel.org/lkml/de6b69781a6ba1fe65535f48db2677eef3ec6a83.1667110240.git.isaku.yamahata@xxxxxxxxx/
> >
> > Thus TDX module initialization in KVM can be handled in similar way.
> >
> > Btw, in v7 (which has per-lp init requirement on all cpus), tdx_init() does
> > early check on whether all machine boot-time present cpu are online and simply
> > returns error if condition is not met.  Here the difference is we don't have any
> > check but depend on SEAMCALL to fail.  To me there's no fundamental difference.
>
> So, I'm going to call shenanigans here.
>
> You say:
>
> 	The caller needs to guarantee this.
>
> Then, you go and tell us how the *ONE* caller of this function doesn't
> actually guarantee this.  Plus, you *KNOW* this.
>
> Those are shenanigans.

Agreed.

>
> Let's do something like this instead of asking for something impossible
> and pretending that the callers are going to provide some fantasy solution.
>
> /*
>  * Attempt to configure the global KeyID on all physical packages.
>  *
>  * This requires running code on at least one CPU in each package.  If a
>  * package has no online CPUs, that code will not run and TDX module
>  * initialization (TDH.whatever) will fail.
>  *
>  * This code takes no affirmative steps to online CPUs.  Callers (aka.
>  * KVM) can ensure success by ensuring sufficient CPUs are online for
>  * this to succeed.
>  */

Thanks.  Will update changelog accordingly.

>
> Now, since this _is_ all imperfect, what will our users see if this
> house of cards falls down?  Will they get a nice error message like:
>
> 	TDX: failed to configure module, no online CPUs in package 12
>
> Or, will they see:
>
> 	TDX: Hurr, durr, I'm confused and you should be too
>
> ?

I am expecting the former.  I will work with Isaku to make sure of it.
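
For readers following the thread, the per-package check discussed above can be
sketched in plain user-space C.  This is only an illustration under stated
assumptions, not the kernel or KVM implementation: the function name
first_package_without_online_cpu(), the MAX_PACKAGES bound, and the array-based
description of the CPU topology are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PACKAGES 64 /* hypothetical upper bound for this sketch */

/*
 * Return the id of the first package that has no online CPU, or -1 if
 * every package has at least one online CPU.
 *
 * cpu_pkg[i]    - package id of CPU i, in the range [0, nr_pkgs)
 * cpu_online[i] - whether CPU i is currently online
 *
 * Hypothetical helper for illustration only; not a kernel API.
 */
int first_package_without_online_cpu(const int *cpu_pkg,
				     const bool *cpu_online,
				     size_t nr_cpus, int nr_pkgs)
{
	bool covered[MAX_PACKAGES] = { false };

	assert(nr_pkgs >= 0 && nr_pkgs <= MAX_PACKAGES);

	/* Mark every package that owns at least one online CPU. */
	for (size_t i = 0; i < nr_cpus; i++) {
		assert(cpu_pkg[i] >= 0 && cpu_pkg[i] < nr_pkgs);
		if (cpu_online[i])
			covered[cpu_pkg[i]] = true;
	}

	/* Report the first package that was never marked. */
	for (int pkg = 0; pkg < nr_pkgs; pkg++) {
		if (!covered[pkg])
			return pkg;
	}

	return -1;
}
```

A caller could run a check like this before attempting module initialization
and, on a non-negative result, name the offending package in the error message,
along the lines of the "no online CPUs in package 12" example above.  Note that
such a check is only a snapshot: it does not by itself stop a CPU from going
offline afterwards.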