Re: [PATCH 1/2] x86/hyperv: Expose an helper to map PCI interrupts

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Fri, 14 Apr 2023 09:28:52 +0200

Stanislav!

On Wed, Apr 12 2023 at 09:36, Stanislav Kinsburskii wrote:
> On Wed, Apr 12, 2023 at 09:19:51AM -0700, Stanislav Kinsburskii wrote:
>> > > +	affinity = irq_data_get_effective_affinity_mask(data);
>> > > +	cpu = cpumask_first_and(affinity, cpu_online_mask);
>> > 
>> > The effective affinity mask of MSI interrupts consists only of online
>> > CPUs, to be accurate: it has exactly one online CPU set.
>> > 
>> > But even if it would have only offline CPUs then the result would be:
>> > 
>> >     cpu = nr_cpu_ids
>> > 
>> > which is definitely invalid. While a disabled vector targeted to an
>> > offline CPU is not necessarily invalid.
>
> Although this patch only moves the code and doesn't make any functional
> changes, I guess if the fix for the used cpu id is required, it has to
> be in a separate patch.

Sure.

> Would you mind elaborating on the problem(s)?
> Do you mean that the result of cpumask_first_and has to be checked for not
> being >= nr_cpu_ids?
> Or do you mean that there is no need to check the irq affinity against
> cpu_online_mask at all and we can simply take any first bit from the
> effective affinity mask?

As of today the effective mask of MSI interrupts contains only online
CPUs. I don't see a reason for that to change.

> Also, could you elaborate more on the disabled vector targeting an
> offline CPU? Is there any use case for such a scenario (in this case we
> might want to support it)?

I'm not aware of one today. That was more a theoretical reasoning.

> I guess the goal of this code is to make sure that the hypervisor won't
> be configured to deliver an MSI to an offline CPU.

Correct, but if the interrupt _is_ masked at the MSI level then the
hypervisor must not deliver an interrupt at all.

The point is that it is valid to target a masked MSI entry to an offline
CPU under the assumption that the hardware/emulation respects the
masking. Whether that's a good idea or not is a different question.

The kernel as of today does not do that. It targets unused but
configured MSI[-x] entries towards MANAGED_IRQ_SHUTDOWN_VECTOR on CPU0
for various reasons, one of them being paranoia.

But in principle there is nothing wrong with targeting a masked entry at
an offline CPU. It should either succeed or be rejected at the software
level, and not expose a completely invalid CPU number to the hypercall
in the first place.

So if you want to be defensive, then keep the _and(), but then check the
result for being valid and emit something useful like a pr_warn_once()
instead of blindly handing the invalid result to the hypercall and then
have that reject it with some undecipherable error code.
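
Something along these lines, as an untested userspace sketch rather than
a patch. A 64-bit word stands in for a cpumask, and first_and(),
pick_target_cpu() and NR_CPU_IDS are made-up names for illustration. In
the kernel the equivalents would be cpumask_first_and(), nr_cpu_ids and
pr_warn_once():

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-ins, NOT the kernel API: a 64-bit word models a cpumask. */
#define NR_CPU_IDS 8u

/*
 * Models the semantics of cpumask_first_and(): returns the first bit set
 * in both masks, or NR_CPU_IDS if the intersection is empty.
 */
static unsigned int first_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	unsigned int cpu;

	assert(NR_CPU_IDS <= 64);
	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Hypothetical helper: returns the target CPU, or -EINVAL if the
 * effective affinity mask has no online CPU. Warns only once, which is
 * what pr_warn_once() would provide in the kernel.
 */
static int pick_target_cpu(uint64_t effective_affinity, uint64_t online)
{
	static int warned;
	unsigned int cpu = first_and(effective_affinity, online);

	if (cpu >= NR_CPU_IDS) {
		if (!warned) {
			warned = 1;
			fprintf(stderr, "no online CPU in effective affinity mask\n");
		}
		return -EINVAL;
	}
	return (int)cpu;
}
```

The caller then bails out on a negative return value instead of passing
the out-of-range CPU number any further down.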

Actually it would not necessarily reach the hypercall because before
that it dereferences cpumask_of(nr_cpu_ids) here:

	nr_bank = cpumask_to_vpset(&(intr_desc->target.vp_set), cpumask_of(cpu));

and explode with a kernel pagefault. If not, it will read some random
adjacent data and try to create a vp_set from it. Neither of those is
anywhere close to correct.

Thanks,

        tglx