Re: PCI service interrupt handlers & access to PCI config space

Lukas Wunner <lukas@xxxxxxxxx> · Sat, 10 Apr 2021 18:26:22 +0200

On Sat, Apr 10, 2021 at 05:17:09PM +0200, Pali Rohár wrote:
> On Saturday 10 April 2021 16:25:24 Lukas Wunner wrote:
> > raw_spin_locks are not supposed to be held for a prolonged
> > period of time.
> 
> What is "prolonged period of time"? Because for example PCIe controller
> on A3720 has upper limit about 1.5s when accessing config space. This is
> what I measured in real example. It really took PCIe HW more than 1s to
> return error code if it happens.

Linux Device Drivers (2005) says:

   "The last important rule for spinlock usage is that spinlocks must
    always be held for the minimum time possible. The longer you hold
    a lock, the longer another processor may have to spin waiting for
    you to release it, and the chance of it having to spin at all is
    greater. Long lock hold times also keep the current processor from
    scheduling, meaning that a higher priority process -- which really
    should be able to get the CPU -- may have to wait. The kernel
    developers put a great deal of effort into reducing kernel latency
    (the time a process may have to wait to be scheduled) in the 2.5
    development series. A poorly written driver can wipe out all that
    progress just by holding a lock for too long. To avoid creating
    this sort of problem, make a point of keeping your lock-hold times
    short."

1.5 sec is definitely too long.  This sounds like a quirk of this
specific hardware.  Try to find out if the hardware can be configured
differently to respond quicker.

> > The interrupt is masked when it triggers and until it's been handled,
> > so the interrupt handler never runs multiple times concurrently.
> > See e.g. handle_level_irq() in kernel/irq/chip.c.
> 
> I'm thinking if it is really safe here. On dual-core system, there may
> be two tasks (on each CPU) which calls pcie_capability_read_word() (or
> other access function). Exactly one pass pci lock. And after that two
> different interrupts (for each CPU) may occur which handlers would like
> to call pcie_capability_read_word(). raw_spin_lock_irqsave (which is
> called from pci_lock_config()) disable interrupts on local CPU, right?
> But what with second CPU? Could not be there issue?

No.

> And what would happen if pcie_capability_read_word() is locked for 1.5s
> as described above? Does not look safe for me.

It's safe, but performs poorly.

Thanks,

Lukas