On Sun, Apr 14, 2019 at 09:36:41PM +0200, Lukas Wunner wrote:
> I suppose this can happen if a write to the Slot Control register is
> performed while HPIE and/or CCIE is disabled, the two notifications
> are subsequently enabled and another write to the Slot Control is
> performed. That second write will then call wait_event_timeout()
> because of the stale ctrl->cmd_busy == 1, but the Command Complete
> notification has already happened and was cleared by pcie_poll_cmd(),
> hence it times out.
>
> Sounds reasonable, I'm a little surprised though that I've never seen
> this myself. I guess we've been doing this wrong for years, so:

On second thought, it's not surprising at all that I never saw this
because Thunderbolt sets NoCompl+, so doesn't use Command Complete
notifications.

I suspect that even though we've been doing this wrong for a long time,
the bug was exposed by a recent change to pciehp. Do you happen to know
since which kernel version or commit you've been witnessing the timeouts?

Thanks,

Lukas