Re: [PATCH 1/1] genirq/msi: Dynamic remove/add storage adapter hits EEH

On Thu, Mar 20 2025 at 09:23, Thomas Gleixner wrote:
> On Wed, Mar 19 2025 at 21:58, Wen Xiong wrote:
>> We don't see the issue without the dynamic remove/add operation.
>> There is a small window in which the irqbalance daemon kicks in during
>> device reset, so it took over 6 hours to recreate the issue when running
>> the remove/add loop.
>
> Sure. You need a loop to hit the window. And it does not matter whether
> it's the probe or the remove which triggers it. Fact is that the reset
> wipes out the config space, which means that any read from the config
> space between reset and restore will return garbage. That problem is not
> restricted to the interrupt code. It's a general problem.

After looking at the code again, it's a problem in the remove()
function:

__ipr_remove()
  ipr_initiate_ioa_bringdown() 
    // resets device
    restore_config_space()
  ....
  ipr_free_all_resources()
    free_irqs()

So yes, it's not probe(). But the question is pretty much the same.

Why is a reset issued while the driver is fully operational and
resources are still in use?

Don't even think about telling me that this is a problem of the MSI
interrupt rework. It is not. It's been broken forever.

You _cannot_ pull the rug out from under a fully operational driver and
expect that all involved parts will just magically handle this gracefully.

What about tearing down resources first and then issuing the reset?
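
To illustrate the ordering argument, here is a minimal user-space sketch.
It is a toy model and not the ipr driver: every name in it (toy_dev,
toy_reset(), toy_affinity_change() and so on) is hypothetical, and the
"config space" is a single integer. It only shows that an interrupt
affinity change, like the one irqbalance triggers, can observe wiped
config space when the reset comes before the interrupt teardown, and
cannot when the teardown comes first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only. None of these names exist in the kernel or in ipr. */
#define TOY_CFG_VALID 0x00a50001u   /* arbitrary "programmed" MSI state */
#define TOY_CFG_WIPED 0xffffffffu   /* what a read returns after reset  */

struct toy_dev {
	uint32_t msi_cfg;            /* stands in for MSI state in config space */
	uint32_t saved_cfg;          /* copy taken before the reset             */
	bool irqs_in_use;            /* interrupts are still requested          */
	unsigned int garbage_reads;  /* config reads that saw wiped state       */
};

static struct toy_dev toy_dev_init(void)
{
	struct toy_dev d = { TOY_CFG_VALID, 0, true, 0 };
	return d;
}

/* Reset: save the config space, then wipe it. */
static void toy_reset(struct toy_dev *d)
{
	d->saved_cfg = d->msi_cfg;
	d->msi_cfg = TOY_CFG_WIPED;
}

static void toy_restore(struct toy_dev *d)
{
	d->msi_cfg = d->saved_cfg;
}

static void toy_free_irqs(struct toy_dev *d)
{
	d->irqs_in_use = false;
}

/* Models an affinity change arriving at an arbitrary point in time.
 * It only touches config space while interrupts are still in use. */
static void toy_affinity_change(struct toy_dev *d)
{
	if (!d->irqs_in_use)
		return;
	if (d->msi_cfg == TOY_CFG_WIPED)
		d->garbage_reads++;
}

/* Order from the call chain above: reset first, free interrupts last. */
static unsigned int toy_remove_reset_first(struct toy_dev *d)
{
	toy_reset(d);
	toy_affinity_change(d);   /* lands in the reset/restore window */
	toy_restore(d);
	toy_affinity_change(d);   /* after restore: reads valid state  */
	toy_free_irqs(d);
	return d->garbage_reads;
}

/* Suggested order: free interrupts first, then reset. */
static unsigned int toy_remove_teardown_first(struct toy_dev *d)
{
	toy_free_irqs(d);
	toy_reset(d);
	toy_affinity_change(d);   /* same window, but nothing is in use */
	toy_restore(d);
	return d->garbage_reads;
}
```

Calling toy_remove_reset_first() on a freshly initialized toy_dev counts
one garbage read, while toy_remove_teardown_first() counts none. Whether
the real driver can free its interrupts before the reset depends on what
the bringdown path itself needs, which this sketch does not model.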

Thanks,

        tglx




