On Tue, 2021-07-13 at 15:15 -0400, Alan Stern wrote: > On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote: > > Customers have been reporting that the I/O is radically being > > slowed down to HPE virtual USB ILO served DVD images during > > installation. > > > > Lots of investigation by the Red Hat lab has found that the issue > > is > > because MSI edge interrupts do not work properly for these > > ILO USB devices. > > We start fast and then drop to polling mode and its unusable. > > > > The issue exists currently upstream on 5.13 as tested by Red Hat, > > and reverting the mentioned patch corrects this upstream. > > > > David Jeffery has this explanation: > > > > The problem with the patch turning on MSI appears to be that the > > ehci > > driver (and possibly other usb controller types too) wasn't written > > to > > support edge-triggered interrupts. > > The ehci_irq routine appears to be written in such a way that it > > will > > be racy with multiple interrupt source bits. > > With a level-triggered interrupt, it gets called another time and > > cleans > > up other interrupt sources. > > But with MSI edge, the interrupt state staying high results in no > > new interrupt and ehci has to run based on polling. > > > > static irqreturn_t ehci_irq (struct usb_hcd *hcd) > > { > > ... > > status = ehci_readl(ehci, &ehci->regs->status); > > > > /* e.g. cardbus physical eject */ > > if (status == ~(u32) 0) { > > ehci_dbg (ehci, "device removed\n"); > > goto dead; > > } > > > > /* > > * We don't use STS_FLR, but some controllers don't like it > > to > > * remain on, so mask it out along with the other status > > bits. > > */ > > masked_status = status & (INTR_MASK | STS_FLR); > > > > /* Shared IRQ? */ > > if (!masked_status || unlikely(ehci->rh_state == > > EHCI_RH_HALTED)) { > > spin_unlock_irqrestore(&ehci->lock, flags); > > return IRQ_NONE; > > } > > > > /* clear (just) interrupts */ > > ehci_writel(ehci, masked_status, &ehci->regs->status); > > ... > > > > ehci_irq() reads the interrupt status register and then writes the > > active > > interrupt-related bits back out to ack the interrupt cause. > > But with an edge interrupt, this is racy as another source of > > interrupt > > could be raised by ehci between the read and the write reaching > > the > > hardware. > > e.g. If STS_IAA was set during the initial read, but some other > > bit like > > STS_INT gets raised by the hardware between the read and the write > > to the > > interrupt status register, the interrupt signal state won't drop. > > The interrupt state says high, and since it is now edged triggered > > with > > MSI, no new invocation of the interrupt handler gets triggered. > > Wouldn't it be better to change these other PCI drivers by adding > proper MSI support? I don't know what would be involved, but > presumably it wouldn't be very hard. (Just run the handler in a > loop > until all the interrupt status bits are off?) > > Alan Stern > Hello Agree with you that is a big hammer approach, but it's such a key piece of the massive number of HPE servers out there and we have many affected customers. While I did all the test work and discovery etc, I am definitely not a USB kernel guy very often, I spend most of my time in storage. I will listen for the other replies to see how the folks who know the subsystem better than I would want this reolved. Thanks Laurence