On Fri, Apr 14, 2023 at 09:28:52AM +0200, Thomas Gleixner wrote:
> Stanislav!
>
> On Wed, Apr 12 2023 at 09:36, Stanislav Kinsburskii wrote:
> > On Wed, Apr 12, 2023 at 09:19:51AM -0700, Stanislav Kinsburskii wrote:
> >> > > +	affinity = irq_data_get_effective_affinity_mask(data);
> >> > > +	cpu = cpumask_first_and(affinity, cpu_online_mask);
> >> >
> >> > The effective affinity mask of MSI interrupts consists only of online
> >> > CPUs, to be accurate: it has exactly one online CPU set.
> >> >
> >> > But even if it would have only offline CPUs then the result would be:
> >> >
> >> >     cpu = nr_cpu_ids
> >> >
> >> > which is definitely invalid. While a disabled vector targeted to an
> >> > offline CPU is not necessarily invalid.
> >
> > Although this patch only tosses the code and doesn't make any functional
> > changes, I guess if the fix for the used cpu id is required, it has to
> > be in a separated patch.
>
> Correct, but if the interrupt _is_ masked at the MSI level then the
> hypervisor must not deliver an interrupt at all.
>
> The point is that it is valid to target a masked MSI entry to an offline
> CPU under the assumption that the hardware/emulation respects the
> masking. Whether that's a good idea or not is a different question.
>
> The kernel as of today does not do that. It targets unused but
> configured MSI[-x] entries towards MANAGED_IRQ_SHUTDOWN_VECTOR on CPU0
> for various reasons, one of them being paranoia.
>
> But in principle there is nothing wrong with that and it should either
> succeed or being rejected at the software level and not expose a
> completely invalid CPU number to the hypercall in the first place.
>
> So if you want to be defensive, then keep the _and(), but then check the
> result for being valid and emit something useful like a pr_warn_once()
> instead of blindly handing the invalid result to the hypercall and then
> have that reject it with some undecipherable error code.
>
> Actually it would not necessarily reach the hypercall because before
> that it dereferences cpumask_of(nr_cpu_ids) here:
>
>     nr_bank = cpumask_to_vpset(&(intr_desc->target.vp_set), cpumask_of(cpu));
>
> and explode with a kernel pagefault. If not it will read some random
> adjacent data and try to create a vp_set from it. Neither of that is
> anywhere close to correct.
>

Thank you Thomas.
I sent a patch to address the problems you highlighted:
"x86/hyperv: Fix IRQ effective cpu discovery for the interrupts unmasking"

I'll update this series after that patch is merged.

Thanks,
Stanislav

> Thanks,
>
>         tglx
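
For readers following the thread, the defensive check suggested above (keep the _and(), then validate the result before using it) can be modelled roughly as follows. This is a userspace sketch only, not the kernel code and not the patch referenced in the reply: a cpumask is modelled as a 64-bit word, NR_CPU_IDS and both model_* helpers are hypothetical names, and the fprintf() call merely stands in for pr_warn_once(). The one property taken from the kernel is that cpumask_first_and() returns a value >= nr_cpu_ids when no bit is set in both masks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace model only: a "cpumask" is a 64-bit word, bit n == CPU n. */
#define NR_CPU_IDS 8u	/* hypothetical CPU count for this model */

/*
 * Models the return convention of cpumask_first_and(): the index of the
 * first bit set in both masks, or NR_CPU_IDS if there is no such bit.
 */
static unsigned int model_first_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	unsigned int cpu;

	assert(NR_CPU_IDS <= 64);	/* the model mask is a single 64-bit word */
	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Hypothetical helper: returns the target CPU, or -1 if the effective
 * affinity mask contains no online CPU. The caller is expected to bail
 * out on -1 instead of passing the value on (in the thread: to
 * cpumask_of() and then the hypercall).
 */
static int model_pick_target_cpu(uint64_t effective_affinity, uint64_t online)
{
	static int warned;	/* warn only once, like pr_warn_once() */
	unsigned int cpu = model_first_and(effective_affinity, online);

	if (cpu >= NR_CPU_IDS) {
		if (!warned) {
			warned = 1;
			fprintf(stderr, "no online CPU in effective affinity mask\n");
		}
		return -1;
	}
	return (int)cpu;
}
```

The point of returning an error rather than the raw index is that the invalid value nr_cpu_ids never reaches any code that would use it as an array index or mask lookup.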