On 11/30/20 9:55 AM, Halil Pasic wrote:
> On Mon, 30 Nov 2020 09:30:33 +0100
> Niklas Schnelle <schnelle@xxxxxxxxxxxxx> wrote:
>
>> I'm not really familiar with it, but I think this is closely related
>> to what I asked Bernd Nerz. I fear that if CPUs go away, we might
>> already be in trouble at the firmware/hardware/platform level, because
>> the CPU address is "programmed into the device" so to speak. Thus a
>> directed interrupt from a device may race with anything
>> reordering/removing CPUs, even if CPU addresses of dead CPUs are not
>> reused and the mapping is stable.
>
> From your answer, I read that CPU hot-unplug is supported for LPAR.

I'm not sure about hot-unplug and the firmware telling us about removed
CPUs, but at the very least there is:

echo 0 > /sys/devices/system/cpu/cpu6/online

>>
>> Furthermore, our floating fallback path will try to send a SIGP
>> to the target CPU, which clearly doesn't work when that is permanently
>> gone. Either way, I think these issues are out of scope for this fix,
>> so I will go ahead and merge this.
>
> I agree, it makes no sense to delay this fix.
>
> But if CPU hot-unplug is supported, I believe we should react when a
> CPU that is a target of directed interrupts is unplugged. My guess is
> that in this scenario transient hiccups are unavoidable, and thus
> should be accepted, but we should make sure that we recover.

I agree. I just tested the above command on a firmware test system and
deactivated 4 of 8 CPUs. This is in /proc/interrupts after that:

...
  3:    9392       0       0       0  PCI-MSI mlx5_async@pci:0001:00:00.0
  4:  282741       0       0       0  PCI-MSI mlx5_comp0@pci:0001:00:00.0
  5:       0       2       0       0  PCI-MSI mlx5_comp1@pci:0001:00:00.0
  6:       0       0     104       0  PCI-MSI mlx5_comp2@pci:0001:00:00.0
  7:       0       0       0       2  PCI-MSI mlx5_comp3@pci:0001:00:00.0
  8:       0       0       0       0  PCI-MSI mlx5_comp4@pci:0001:00:00.0
  9:       0       0       0       0  PCI-MSI mlx5_comp5@pci:0001:00:00.0
 10:       0       0       0       0  PCI-MSI mlx5_comp6@pci:0001:00:00.0
 11:       0       0       0       0  PCI-MSI mlx5_comp7@pci:0001:00:00.0
...
So it looks like we are left with registered interrupts for CPUs which
are offline. However, I'm not sure how to trigger a problem with that. I
think the drivers would usually only send a directed interrupt to a CPU
that is currently running the process that triggered the I/O (I tested
this assumption with "taskset -c 2 ping ..."). Now with the CPU offline,
there cannot be such a process, so I think for the most part the queue
would just remain unused. Still, if we do get a directed interrupt for an
offline CPU, it's my understanding that currently we will lose it.

I think this could be fixed with something I tried in prototype code a
while back: in zpci_handle_fallback_irq() I handled the IRQ locally. Back
then it looked like directed IRQs would make it to z15 GA 1.5, and this
was done to help Bernd debug a millicode issue (Jup 905371). I also had a
version of that code, meant as a possible performance improvement, that
would check if the target CPU is available, and only then send the SIGP,
otherwise handling the IRQ locally.

>
> Regards,
> Halil
>