On 11/27/20 4:39 PM, Halil Pasic wrote: > On Fri, 27 Nov 2020 11:08:10 +0100 > Niklas Schnelle <schnelle@xxxxxxxxxxxxx> wrote: > >> >> >> On 11/27/20 9:56 AM, Halil Pasic wrote: >>> On Thu, 26 Nov 2020 18:00:37 +0100 >>> Alexander Gordeev <agordeev@xxxxxxxxxxxxx> wrote: >>> >>>> The directed MSIs are delivered to CPUs whose address is >>>> written to the MSI message data. The current code assumes >>>> that a CPU logical number (as it is seen by the kernel) >>>> is also that CPU address. >>>> >>>> The above assumption is not correct, as the CPU address >>>> is rather the value returned by STAP instruction. That >>>> value does not necessarily match the kernel logical CPU >>>> number. >>>> >>>> Fixes: e979ce7bced2 ("s390/pci: provide support for CPU directed interrupts") >>>> Signed-off-by: Alexander Gordeev <agordeev@xxxxxxxxxxxxx> >>>> --- >>>> arch/s390/pci/pci_irq.c | 14 +++++++++++--- >>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c >>>> index 743f257cf2cb..75217fb63d7b 100644 >>>> --- a/arch/s390/pci/pci_irq.c >>>> +++ b/arch/s390/pci/pci_irq.c >>>> @@ -103,9 +103,10 @@ static int zpci_set_irq_affinity(struct irq_data *data, const struct cpumask *de >>>> { >>>> struct msi_desc *entry = irq_get_msi_desc(data->irq); >>>> struct msi_msg msg = entry->msg; >>>> + int cpu_addr = smp_cpu_get_cpu_address(cpumask_first(dest)); >>>> >>>> msg.address_lo &= 0xff0000ff; >>>> - msg.address_lo |= (cpumask_first(dest) << 8); >>>> + msg.address_lo |= (cpu_addr << 8); >>>> pci_write_msi_msg(data->irq, &msg); >>>> >>>> return IRQ_SET_MASK_OK; >>>> @@ -238,6 +239,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) >>>> unsigned long bit; >>>> struct msi_desc *msi; >>>> struct msi_msg msg; >>>> + int cpu_addr; >>>> int rc, irq; >>>> >>>> zdev->aisb = -1UL; >>>> @@ -287,9 +289,15 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) >>>> handle_percpu_irq); >>>> msg.data = hwirq - bit; >>>> if (irq_delivery == DIRECTED) { >>>> + if (msi->affinity) >>>> + cpu = cpumask_first(&msi->affinity->mask); >>>> + else >>>> + cpu = 0; >>>> + cpu_addr = smp_cpu_get_cpu_address(cpu); >>>> + >>> >>> I thin style wise, I would prefer keeping the ternary operator instead >>> of rewriting it as an if-then-else, i.e.: >>> cpu_addr = smp_cpu_get_cpu_address(msi->affinity ? >>> cpumask_first(&msi->affinity->mask) : 0); >>> but either way: >>> >>> Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx> >> >> Thanks for your review, lets keep the if/else its certainly not less >> readable even if it may be less pretty. >> >> Found another thing (not directly in the touched code) but I'm now >> wondering about. In zpci_handle_cpu_local_irq() >> we do >> struct airq_iv *dibv = zpci_ibv[smp_processor_id()]; >> >> does that also need to use some _address() variant? If it does that >> then dicatates that the CPU addresses must start at 0. >> > > I didn't go to the bottom of this, but my understanding is that it > does not need a _address() variant. What we need is, probably, the > mapping between the 'id' and 'address' being a stable one. > > Please notice that cpu_enable_directed_irq() is called on each cpu. That > establishes the mapping/relationship between the id and the address, > as the machine cares for the address, and cpu_enable_directed_irq() > cares for the id: > static void __init cpu_enable_directed_irq(void *unused) > { > union zpci_sic_iib iib = {{0}}; > > iib.cdiib.dibv_addr = (u64) zpci_ibv[smp_processor_id()]->vector; > > __zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib); > zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC); > } Thanks for the very clear and understandable explanation, I think you're exactly right. I didn't look very closely and missed that cpu_enable_directed_irq() uses the smp_processor_id() thereby establishing the mapping. > > Now were the id <-> address mapping to change, we would be in trouble. If > that's possible, I don't know. My guess is that it would require cpu hot > unplug. Niklas, are you familiar with that stuff? Should we ask, Heiko > and Vasily? > > Regards, > Halil I'm not really familiar, with it but I think this is closely related to what I asked Bernd Nerz. I fear that if CPUs go away we might already be in trouble at the firmware/hardware/platform level because the CPU Address is "programmed into the device" so to speak. Thus a directed interrupt from a device may race with anything reordering/removing CPUs even if CPU addresses of dead CPUs are not reused and the mapping is stable. Furthermore our floating fallback path will try to send a SIGP to the target CPU which clearly doesn't work when that is permanently gone. Either way I think these issues are out of scope for this fix so I will go ahead and merge this. > >>> >>>> msg.address_lo = zdev->msi_addr & 0xff0000ff; >>>> - msg.address_lo |= msi->affinity ? >>>> - (cpumask_first(&msi->affinity->mask) << 8) : 0; >>>> + msg.address_lo |= (cpu_addr << 8); >>>> + >>>> for_each_possible_cpu(cpu) { >>>> airq_iv_set_data(zpci_ibv[cpu], hwirq, irq); >>>> } >>> >