Re: [PATCH v6 00/10] PCI: Fix unhandled interrupt on shutdown

Bjorn Helgaas <bhelgaas@xxxxxxxxxx> writes:

> On Sun, Apr 26, 2015 at 08:50:06AM +0200, Michael S. Tsirkin wrote:
>> On Fri, Apr 10, 2015 at 05:54:19PM -0500, Bjorn Helgaas wrote:
>> > Hi Michael,
>> > 
>> > I put your patches on my pci/msi branch and I hope to merge them for v4.1.
>> > I didn't apply the acks from Fam and Eric because I made changes to those
>> > patches that weren't completely trivial.  I think the end result is
>> > equivalent, though.  The diff attached to this cover letter is the
>> > difference between your v5 series and this v6 series.
>> > 
>> > As far as I'm concerned, this is ready to go except that I would like a
>> > little more info about the virtio kernel hang to include in the changelog
>> > for "PCI/MSI: Don't disable MSI/MSI-X at shutdown".
>> 
>> 
>> Hi Bjorn,
>> do you have everything you need to merge this?
>
> No.  I made the minor changelog edits you suggested and the result is on
> my pci/msi-v7 branch.  But I still have these open issues:
>
>   - The last thing I heard from Eric was that "not disabling MSI/MSI-X at
>     shutdown is the wrong fix, and someone needs to fix a buggy driver."
>     I want to hear Eric say "OK, we need to leave MSI/MSI-X enabled at
>     shutdown for this case."

So far this just sounds like a device that needs a shutdown method.

>   - One changelog says "Stop disabling MSIs at shutdown to avoid
>     conflicting with drivers."  But I don't know what the conflict is.
>
>   - The bugzilla has no dmesg log or detailed analysis.  Fam said the
>     scenario I came up with
>     (http://lkml.kernel.org/r/20150416194245.GB20701@xxxxxxxxxx)
>     was fairly close, but it took me a lot of work to derive that.  Fixing
>     any errors in it and putting it in the bugzilla would be a big step.
>     The bugzilla should have the raw data and the analysis, so someone else
>     can validate the analysis and conclude that this patch is a reasonable
>     fix for it.  That's currently impossible because the bugzilla really
>     only contains the fix as a fait accompli.

What I saw in the bugzilla was:

An interrupt was stuck on, and being reasserted as quickly as we could
call iret for that interrupt.

We did not disable that interrupt because irq debugging was explicitly
disabled on the kernel command line.

The irq was asserted because the device did not have a shutdown
method to stop the device from doing things.

There is an argument that disabling bus mastering should have disabled
whatever the interrupting condition was (and thus there may also be a
bug in the qemu device emulation).

So I read this as the driver, and maybe the "hardware", being buggy,
not the Linux PCI layer.

Eric
