On Fri, 2024-09-06 at 09:37 +0900, Damien Le Moal wrote:
> On 9/5/24 16:13, Philipp Stanner wrote:
> > On Thu, 2024-09-05 at 09:33 +0900, Damien Le Moal wrote:
> > > On 2024/09/05 6:10, Alex Williamson wrote:
> > > > On Wed, 4 Sep 2024 23:24:53 +0300
> > > > Andy Shevchenko <andy.shevchenko@xxxxxxxxx> wrote:
> > > > > Wed, Sep 04, 2024 at 12:07:21PM -0600, Alex Williamson wrote:
> > > > > > On Wed, 04 Sep 2024 15:37:25 +0200
> > > > > > Philipp Stanner <pstanner@xxxxxxxxxx> wrote:
> > > > > > > On Wed, 2024-09-04 at 17:25 +0900, Damien Le Moal wrote:
> > > > >
> > > > > ...
> > > > >
> > > > > > > If vfio-pci can get rid of pci_intx() altogether, that
> > > > > > > might be a good thing. As far as I understood Andy
> > > > > > > Shevchenko, pci_intx() is outdated. There's only a
> > > > > > > handful of users anyways.
> > > > > >
> > > > > > What's the alternative?
> > > > >
> > > > > From an API perspective, pci_alloc_irq_vectors() & Co should
> > > > > be used.
> > > >
> > > > We can't replace a device-level INTx control with a vector
> > > > allocation function.
> > > >
> > > > > > vfio-pci has a potentially unique requirement here, we
> > > > > > don't know how to handle the device interrupt, we only
> > > > > > forward it to the userspace driver. As a level-triggered
> > > > > > interrupt, INTx will continue to assert until that
> > > > > > userspace driver handles the device. That's obviously
> > > > > > unacceptable from a host perspective, so INTx is masked
> > > > > > at the device via pci_intx() where available, or at the
> > > > > > interrupt controller otherwise. The API with the
> > > > > > userspace driver requires that driver to unmask the
> > > > > > interrupt, again resulting in a call to pci_intx() or
> > > > > > unmasking the interrupt controller, in order to receive
> > > > > > further interrupts from the device. Thanks,
> > > > >
> > > > > I briefly read the discussion and if I understand it
> > > > > correctly the problem here is in the flow: when the above
> > > > > mentioned API is being called. Hence it's a design (or
> > > > > architectural) level issue and changing the call from foo()
> > > > > to bar() won't magically make the problem go away. But I
> > > > > might be mistaken.
> > > >
> > > > Certainly from a vector allocation standpoint we can change
> > > > to whatever is preferred, but the direct INTx manipulation
> > > > functions are a different thing entirely and afaik there's
> > > > nothing else that can replace them at a low level, nor can we
> > > > just get rid of our calls to pci_intx(). Thanks,
> > >
> > > But can these calls be moved out of the spinlock context? If
> > > not, then we need to clarify that pci_intx() can be called from
> > > any context, which will require changing to GFP_ATOMIC for the
> > > resource allocation, even if the use case cannot trigger the
> > > allocation. This is needed to ensure the correctness of the
> > > pci_intx() function use.
> >
> > We could do that I guess. As I keep saying, it's not intended to
> > have pci_intx() allocate _permanently_. This is a temporary
> > situation. pci_intx() should have neither devres nor allocation.
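
To make concrete what that hybrid behavior looks like, here is a
condensed sketch of the allocation path. This is illustrative only,
not the verbatim code from pci.c / devres.c, and intx_register_devres()
is a made-up stand-in for the actual devres registration:

void pci_intx(struct pci_dev *pdev, int enable)
{
	u16 cmd, new;

	pci_read_config_word(pdev, PCI_COMMAND, &cmd);
	if (enable)
		new = cmd & ~PCI_COMMAND_INTX_DISABLE;
	else
		new = cmd | PCI_COMMAND_INTX_DISABLE;
	if (new == cmd)
		return;

	/*
	 * Hybrid part: if the device was enabled via
	 * pcim_enable_device(), the first call also registers a devres
	 * entry so the original INTx state is restored on driver
	 * detach. That registration allocates with GFP_KERNEL and may
	 * sleep -- which is exactly what splats when the caller holds
	 * a spinlock.
	 */
	if (pci_is_managed(pdev))
		intx_register_devres(pdev, !enable); /* made-up name */

	pci_write_config_word(pdev, PCI_COMMAND, new);
}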
> > > Frankly, I am surprised that the might-sleep splat you got was
> > > not already reported before (fuzzing, static analyzers might
> > > eventually catch that though).
> >
> > It's a super rare situation:
> > * pci_intx() has very few callers
> > * It only allocates if pcim_enable_device() instead of
> >   pci_enable_device() ran.
> > * It only allocates when it's called for the FIRST TIME
> > * All of the above is only a problem while you hold a lock
> >
> > > The other solution would be a version of pci_intx() that has a
> > > gfp flags argument to allow callers to use the right gfp flags
> > > for the call context.
> >
> > I don't think that's a good idea. As I said, I want to clean up
> > all that in the mid term.
> >
> > As a matter of fact, there is already __pcim_intx() in
> > pci/devres.c as a pure unmanaged pci_intx(), as a means to split
> > and then clean up the APIs.
>
> Yeah. That naming is in fact confusing. __pcim_intx() should really
> be named pci_intx()...
>
> > One path towards getting the hybrid behavior out of pci_intx()
> > could be to rename __pcim_intx() to pci_intx_unmanaged() and port
> > everyone who uses pci_enable_device() + pci_intx() to that
> > version. That would be better than having a third version with a
> > gfp_t argument.
>
> Sounds good. But ideally, all users that rely on the managed
> variant should be converted to use pcim_intx() and pci_intx()
> changed to not call into devres. But that may be too much code
> churn (I have not checked).

Exactly that is the plan.

I looked into it yesterday and my idea is to publish __pcim_intx()
under the name pci_intx_unmanaged(), then port everyone who doesn't
rely on managed behavior to that function, and then port everyone who
does rely on it to pcim_intx() (as you suggest). Afterwards you can
remove pci_intx_unmanaged() again and also remove the hybrid behavior
from pci_intx(). It's doable with not that much code. Getting it
merged might be politically difficult, though.

The only thing I'm really a bit afraid of is the pci_intx() user in
pci/msi/ – MSI calls that through yet another implicit devres call,
pcim_msi_release(). I *suspect* that pci_intx() in MSI can just be
made pci_intx_unmanaged(), but that should be checked carefully. The
doubly implicit devres magic caused some trouble in the past already,
as we had discussed here [1].

P.

[1] https://lore.kernel.org/all/ee44ea7ac760e73edad3f20b30b4d2fff66c1a85.camel@xxxxxxxxxx/
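
P.S.: For clarity, this is roughly where the series would end up. A
sketch only, mirroring the plan above; pci_intx_unmanaged() does not
exist yet and nothing here is merged code:

/*
 * Pure register toggle: no devres, no allocation, callable from
 * atomic context -- today's __pcim_intx() under its new name.
 */
void pci_intx_unmanaged(struct pci_dev *pdev, int enable)
{
	u16 cmd, new;

	pci_read_config_word(pdev, PCI_COMMAND, &cmd);
	if (enable)
		new = cmd & ~PCI_COMMAND_INTX_DISABLE;
	else
		new = cmd | PCI_COMMAND_INTX_DISABLE;
	if (new != cmd)
		pci_write_config_word(pdev, PCI_COMMAND, new);
}

Callers would then be split by how they enabled the device:

	/* pci_enable_device() users -- no automatic cleanup wanted */
	pci_intx_unmanaged(pdev, 1);

	/* pcim_enable_device() users -- explicitly managed variant */
	ret = pcim_intx(pdev, 1);

Once no caller of the hybrid pci_intx() is left, the devres behavior
can be dropped from it and pci_intx_unmanaged() folded back into
pci_intx().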