On 1/12/2024 9:23 PM, Thomas Gleixner wrote:
On Thu, Jan 11 2024 at 10:58, Vidya Sagar wrote:
While calculating the hwirq number for an MSI interrupt, the higher
bits (i.e. from bit-5 onwards, a.k.a. domain_nr >= 32) of the PCI domain
number get truncated because the shift is evaluated in the return type
of pci_domain_nr(), which is 'int'. This, for example, results in the
same hwirq number for devices 0019:00:00.0 and 0039:00:00.0.
So, cast the PCI domain number to 'irq_hw_number_t' before left shifting
it to calculate hwirq number.
This still does not explain that this fixes it only on 64-bit platforms
and why we don't care for 32-bit systems.
Agreed, this fixes the issue only on 64-bit platforms. It doesn't
change the behavior on 32-bit platforms. My understanding is that the
issue surfaces only if there are many PCIe controllers in the system,
which is usually the case in modern server systems, and it is doubtful
that such server systems really run 32-bit kernels.
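
To illustrate why the cast helps only where 'irq_hw_number_t' is 64 bits
wide, here is a small user-space sketch. It is not the kernel code: the
helper names are made up for illustration, uint32_t and uint64_t stand
in for a 32-bit and a 64-bit 'irq_hw_number_t', and the shift of 27 is
inferred from the statement that domain bit 5 onwards is lost in a
32-bit value.

```c
#include <stdint.h>

/*
 * Assumed bit position of the PCI domain number inside hwirq
 * (inferred: domain bit 5 must land on result bit 32).
 */
#define DOMAIN_SHIFT 27

/*
 * Hypothetical stand-in for a 32-bit irq_hw_number_t: the shift is done
 * in 32 bits, so domain bits 5 and above fall off the top.
 */
static uint32_t hwirq_domain_bits32(uint32_t domain)
{
	return domain << DOMAIN_SHIFT;
}

/*
 * Hypothetical stand-in for a 64-bit irq_hw_number_t: widening before
 * the shift keeps all domain bits.
 */
static uint64_t hwirq_domain_bits64(uint32_t domain)
{
	return (uint64_t)domain << DOMAIN_SHIFT;
}
```

With domains 0x19 and 0x39, the 32-bit helper returns 0xC8000000 for
both, while the 64-bit helper returns 0xC8000000 and 0x1C8000000.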
One way to fix it for both 32-bit and 64-bit systems is to change the
type of 'hwirq' to u64. This may cause two memory reads on 32-bit
systems whenever 'hwirq' is accessed, and that may in turn have some
performance impact. Is this the way you think I should handle it?