On Mon, 23 Aug 2021 12:03:08 +0100, Barry Song <21cnbao@xxxxxxxxx> wrote: > > On Mon, Aug 23, 2021 at 10:30 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > > > > On Sat, 21 Aug 2021 23:14:35 +0100, > > Barry Song <21cnbao@xxxxxxxxx> wrote: > > > > > > On Sat, Aug 21, 2021 at 10:42 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > > > > > > > > Hi Bjorn, > > > > > > > > On Sat, 21 Aug 2021 00:33:28 +0100, > > > > Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > > > > > > > > [...] > > > > > > > In msix_setup_entries(), we get nvecs msi_entry structs, and we > > > > > get a saved .default_irq in each one? > > > > > > > > That's a key point. > > > > > > > > Old-school PCI/MSI is represented by a single interrupt, and you > > > > *could* somehow make it relatively easy for drivers that only > > > > understand INTx to migrate to MSI if you replaced whatever is held in > > > > dev->irq (which should only represent the INTx mapping) with the MSI > > > > interrupt number. Which I guess is what the MSI code is doing. > > > > > > > > This is the 21st century, and nobody should ever rely on such horror, > > > > but I'm sure we do have such drivers in the tree. Boo. > > > > > > > > However, this *cannot* hold true for Multi-MSI, nor MSI-X, because > > > > there is a plurality of interrupts. Even worse, for MSI-X, there is > > > > zero guarantee that the allocated interrupts will be in a contiguous > > > > space. > > > > > > > > Given that, what is dev->irq good for? "Absolutely Nothing! (say it > > > > again!)". > > > > > > > > > > The only thing is that dev->irq is an sysfs ABI to userspace. Due to > > > the inconsistency between legacy PCI INTx, MSI, MSI-X, this ABI > > > should have been absolutely broken nowadays. This is actually what > > > the patchset was originally aiming at to fix. > > > > I do not think we should expose more of a broken abstraction to > > userspace. We will have to carry on exposing the first MSI in this > > field forever, but it doesn't mean we should have to do it for MSI-X. > > > > > One more question from me is that does dev->irq actually hold any > > > valid hardware INTx information while hardware is using MSI-X? At > > > least in my hardware, sysfs ABI for PCI is all "0". > > > > That's probably because nothing actually configured the interrupt, or > > that there is no INTx implementation. I have that on systems with > > pretty dodgy (or incomplete) firmware. > > > > > root@ubuntu:/sys/devices/pci0000:7c/0000:7c:00.0/0000:7d:00.3# cat irq > > > 0 > > > > > > root@ubuntu:/sys/devices/pci0000:7c/0000:7c:00.0/0000:7d:00.3# ls -l msi_irqs/* > > > -r--r--r-- 1 root root 4096 Aug 21 22:04 msi_irqs/499 > > > -r--r--r-- 1 root root 4096 Aug 21 22:04 msi_irqs/500 > > > -r--r--r-- 1 root root 4096 Aug 21 22:04 msi_irqs/501 > > > ... > > > root@ubuntu:/sys/devices/pci0000:7c/0000:7c:00.0/0000:7d:00.3# cat msi_irqs/499 > > > msix > > > > > > Not quite sure how it is going on different hardware platforms. > > > > My D05 does that as well, and it doesn't expose any INTx support. > > > > > > > > > MSI-X is not something you can "accidentally" use. You have to > > > > actively embrace it. In all honesty, this patch tries to move in the > > > > wrong direction. If anything, we should kill this hack altogether and > > > > fix the (handful of?) drivers that rely on it. That'd actually be a > > > > good way to find whether they are still worth keeping in the tree. And > > > > if it breaks too many of them, then at least we'll know where we > > > > stand. > > > > > > > > I'd be tempted to leave the below patch simmer in -next for a few > > > > weeks and see if how many people shout: > > > > > > This looks like a more proper direction to go. > > > but here i am wondering how sysfs ABI document should follow the below change > > > doc is patch 2/2: > > > https://lore.kernel.org/lkml/20210820223744.8439-3-21cnbao@xxxxxxxxx/ > > > > > > On the other hand, my feeling is that nobody should depend on sysfs > > > irq entry nowadays. > > > > Too late. It is there, and we need to preserve it. I just don't think > > feeding it more erroneous information is the right thing to do. > > > > My patch was only dealing with the kernel side of things, not the > > userspace ABI. That ABI should be carried on unchanged. > > it seems this isn't true. your patch is also changing userspace ABI > as long as you change pci_dev->irq which will be shown in sysfs irq > entry. I guess I wasn't clear enough above. Let me rephrase this: My patch was only dealing with the kernel side of things, not the userspace ABI. That ABI should be carried on unchanged, which requires additional changes in the sysfs code. > if we don't want to change the behaviour of any existing ABI, it > seems the only thing we can do here to document it well in ABI > doc. i actually doubt anyone has really understood what the irq > entry is really showing. Given that we can't prove that it is actually the case, I believe this is the only option. Thanks, M. -- Without deviation from the norm, progress is not possible.