On Thu, Sep 30, 2021, at 12:29 PM, Thomas Gleixner wrote:
> On Thu, Sep 30 2021 at 11:08, Andy Lutomirski wrote:
>> On Tue, Sep 28, 2021, at 9:56 PM, Sohil Mehta wrote:
>> I think we have three choices:
>>
>> Use a fancy wrapper around SENDUIPI. This is probably a bad idea.
>>
>> Treat the NV-2 as a real interrupt and honor affinity settings. This
>> will be annoying and slow, I think, if it's even workable at all.
>
> We can make it a real interrupt in form of a per CPU interrupt, but
> affinity settings are not really feasible because the affinity is in the
> UPID.ndst field. So, yes we can target it to some CPU, but that's racy.
>
>> Handle this case with faults instead of interrupts. We could set a
>> reserved bit in UPID so that SENDUIPI results in #GP, decode it, and
>> process it. This puts the onus on the actual task causing trouble,
>> which is nice, and it lets us find the UPID and target directly
>> instead of walking all of them. I don't know how well it would play
>> with hypothetical future hardware-initiated uintrs, though.
>
> I thought about that as well and dismissed it due to the hardware
> initiated ones but thinking more about it, those need some translation
> unit (e.g. irq remapping) anyway, so it might be doable to catch those
> as well. So we could just ignore them for now and go for the #GP trick
> and deal with the device initiated ones later when they come around :)

Sounds good to me. In the long run, if Intel wants device initiated
fancy interrupts to work well, they need a new design.

>
> But even with that we still need to keep track of the armed ones per CPU
> so we can handle CPU hotunplug correctly. Sigh...

I don’t think any real work is needed. We will only ever have armed
UPIDs (with notification interrupts enabled) for running tasks, and
hot-unplugged CPUs don’t have running tasks. We do need a way to drain
pending IPIs before we offline a CPU, but that’s a separate problem and
may be unsolvable for all I know.

Is there a magic APIC operation to wait until all initiated IPIs
targeting the local CPU arrive? I guess we can also just mask the
notification vector so that it won’t crash us if we get a stale IPI
after going offline.

>
> Thanks,
>
>         tglx
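
To make the #GP idea above concrete, here is a rough user-space model of it, for illustration only. Everything in it is an assumption and not code from the patch set: the names upid_model, UPID_SW_BLOCK, upid_arm(), upid_disarm(), model_senduipi() and model_gp_fixup() are hypothetical, the choice of status bit 7 as the marker is arbitrary, and the field layout is only a sketch of the UPID (status with ON/SN, NV, NDST, posted requests). The real SENDUIPI also does a UITT lookup and sends an actual IPI, both of which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status bits 0 and 1 are ON and SN; the rest are treated as reserved. */
#define UPID_ON        (1u << 0)  /* outstanding notification */
#define UPID_SN        (1u << 1)  /* suppress notification */
#define UPID_SW_BLOCK  (1u << 7)  /* hypothetical: reserved bit used as a trap marker */

/* Simplified model of a UPID; not a kernel structure. */
struct upid_model {
    uint8_t  status;
    uint8_t  reserved1;
    uint8_t  nv;     /* notification vector */
    uint8_t  reserved2;
    uint32_t ndst;   /* notification destination (CPU) */
    uint64_t puir;   /* posted user interrupt requests */
};

static_assert(sizeof(struct upid_model) == 16, "UPID is 128 bits");

enum send_result { SEND_POSTED, SEND_POSTED_AND_NOTIFIED, SEND_FAULT };

/* Receiver is about to run on 'cpu': clear the marker so senders stop faulting. */
static void upid_arm(struct upid_model *u, uint32_t cpu)
{
    u->ndst = cpu;
    u->status &= (uint8_t)~UPID_SW_BLOCK;
}

/* Receiver is scheduled out: set the marker so the next sender faults. */
static void upid_disarm(struct upid_model *u)
{
    u->status |= UPID_SW_BLOCK;
}

/* Models only the UPID half of SENDUIPI. */
static enum send_result model_senduipi(struct upid_model *u, unsigned vec)
{
    /* Any reserved bit set: the sender takes #GP. */
    if ((u->status & 0xFCu) || u->reserved1 || u->reserved2)
        return SEND_FAULT;

    u->puir |= 1ULL << (vec & 63);

    /* Notification already outstanding or suppressed: no IPI. */
    if (u->status & (UPID_ON | UPID_SN))
        return SEND_POSTED;

    u->status |= UPID_ON;
    return SEND_POSTED_AND_NOTIFIED;  /* hardware would send nv to ndst here */
}

/*
 * What a #GP handler might do after decoding the faulting SENDUIPI and
 * locating the target UPID: post the vector in software. The caller would
 * then wake the blocked receiver. Returns false for a fault we did not cause.
 */
static bool model_gp_fixup(struct upid_model *u, unsigned vec)
{
    if (!(u->status & UPID_SW_BLOCK))
        return false;
    u->puir |= 1ULL << (vec & 63);
    return true;
}
```

With this shape, only running tasks ever have an armed UPID, and the cost of waking a blocked receiver falls on the sender that took the fault.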