On 11/22/22 11:13, Peter Zijlstra wrote:
> On Tue, Nov 22, 2022 at 07:14:14AM -0800, Dave Hansen wrote:
>> On 11/22/22 01:13, Peter Zijlstra wrote:
>>> On Mon, Nov 21, 2022 at 01:26:28PM +1300, Kai Huang wrote:
>>>> +/*
>>>> + * Call the SEAMCALL on all online CPUs concurrently.  Caller to check
>>>> + * @sc->err to determine whether any SEAMCALL failed on any cpu.
>>>> + */
>>>> +static void seamcall_on_each_cpu(struct seamcall_ctx *sc)
>>>> +{
>>>> +	on_each_cpu(seamcall_smp_call_function, sc, true);
>>>> +}
>>>
>>> Suppose the user has NOHZ_FULL configured, and is already running
>>> userspace that will terminate on interrupt (this is desired feature for
>>> NOHZ_FULL), guess how happy they'll be if someone, on another partition,
>>> manages to tickle this TDX gunk?
>>
>> Yeah, they'll be none too happy.
>>
>> But, what do we do?
>
> Not initialize TDX on busy NOHZ_FULL cpus and hard-limit the cpumask of
> all TDX using tasks.

I don't think that works.  As I mentioned to Thomas elsewhere, you don't
just need to initialize TDX on the CPUs where it is used.  Before the
module will start working you need to initialize it on *all* the CPUs it
knows about.  The module itself has a little counter where it tracks
this and will refuse to start being useful until it gets called
thoroughly enough.

>> There are technical solutions like detecting if NOHZ_FULL is in play and
>> refusing to initialize TDX.  There are also non-technical solutions like
>> telling folks in the documentation that they better modprobe kvm early
>> if they want to do TDX, or their NOHZ_FULL apps will pay.
>
> Surely modprobe kvm isn't the point where TDX gets loaded? Because
> that's on boot for everybody due to all the auto-probing nonsense.
>
> I was expecting TDX to not get initialized until the first TDX using KVM
> instance is created. Am I wrong?

I went looking for it in this series to prove you wrong.  I failed.  :)

tdx_enable() is buried in here somewhere:

> https://lore.kernel.org/lkml/CAAhR5DFrwP+5K8MOxz5YK7jYShhaK4A+2h1Pi31U_9+Z+cz-0A@xxxxxxxxxxxxxx/T/

I don't have the patience to dig it out today, so I guess we'll have Kai
tell us.

>> We could also force the TDX module to be loaded early in boot before
>> NOHZ_FULL is in play, but that would waste memory on TDX metadata even
>> if TDX is never used.
>
> I'm thinking it makes sense to have a tdx={off,on-demand,force} toggle
> anyway.

Yep, that makes total sense.  Kai had one in an earlier version but I
made him throw it out because it wasn't *strictly* required and this set
is fat enough.

>> How do NOHZ_FULL folks deal with late microcode updates, for example?
>> Those are roughly equally disruptive to all CPUs.
>
> I imagine they don't do that -- in fact I would recommend we make the
> whole late loading thing mutually exclusive with nohz_full; can't have
> both.

So, if we just use schedule_on_cpu() for now and have the TDX code wait,
will a NOHZ_FULL task just block the schedule_on_cpu() indefinitely?

That doesn't seem like _horrible_ behavior to start off with for a
minimal series.
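
To make the point about the module's counter concrete, here is a rough
userspace model of the behavior described above: the module counts each
CPU that has called in and stays unusable until every CPU it knows about
has done so.  This is only a sketch.  The names (tdx_module_model,
model_init_cpu, model_ready, NR_CPUS_MODEL) are hypothetical and made up
for illustration; they are not part of the series or of the TDX module
interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical CPU count for this model; not a real kernel constant. */
#define NR_CPUS_MODEL 4

/* Hypothetical stand-in for the module's internal bookkeeping. */
struct tdx_module_model {
	bool cpu_done[NR_CPUS_MODEL];	/* which CPUs have called in */
	int nr_done;			/* the "little counter" */
};

/* Model of the per-CPU init call: each CPU is counted at most once. */
static void model_init_cpu(struct tdx_module_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	if (!m->cpu_done[cpu]) {
		m->cpu_done[cpu] = true;
		m->nr_done++;
	}
}

/* The module is usable only after every CPU it knows about called in. */
static bool model_ready(const struct tdx_module_model *m)
{
	return m->nr_done == NR_CPUS_MODEL;
}
```

In this model, skipping even one CPU (say, a busy NOHZ_FULL one) leaves
model_ready() false no matter how many times the other CPUs call in,
which is why restricting initialization to a subset of CPUs does not
help.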