lspci reports 'stale' BAR info after a card reset

I'm doing some experiments that generate errors on a PCIe link.
(I'd also like to get AER to report them, but that is a different
problem.)

If I force 'link down' (by shorting the TX lines with a screwdriver!)
the card resets everything related to the PCIe link, including
all of config space and, in particular, the BARs.

The kernel doesn't know this has happened, so the bridge is left
configured and the device driver (hopefully) doesn't crash the
kernel when it gets 0xffffffff back from reads!

If I then run 'lspci -vx' I get:
09:00.0 Class 0004: Device 12d9:001e (rev 01) (prog-if 02)
        Subsystem: Device 12d9:0001
        Flags: fast devsel, IRQ 16, NUMA node 0
        [virtual] Memory at fa200000 (32-bit, non-prefetchable) [size=1M]
        [virtual] Memory at fa100000 (32-bit, non-prefetchable) [size=1M]
        [virtual] Memory at fa300000 (32-bit, non-prefetchable) [size=8K]
        ...
00: d9 12 1e 00 00 00 10 00 01 02 04 00 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 12 01 00
30: 00 00 00 00 50 00 00 00 00 00 00 00 00 01 00 00

Note that the hexdump shows the BARs (offsets 0x10 to 0x27) as all zero,
but the decoded text shows the addresses that the kernel thinks are in use.
Nothing in the output makes it obvious that something has gone badly wrong.
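
To make the mismatch explicit, the dump above can be decoded by hand.
This is only a rough sketch in Python, assuming the standard type 0
header layout (six little-endian 32-bit BARs at offsets 0x10 to 0x27).
The function names are made up for illustration.

```python
import struct

# The four hexdump lines from the 'lspci -vx' output above.
DUMP = """\
00: d9 12 1e 00 00 00 10 00 01 02 04 00 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 12 01 00
30: 00 00 00 00 50 00 00 00 00 00 00 00 00 01 00 00
"""


def parse_hexdump(text):
    """Turn 'lspci -x' style lines ('10: 00 00 ...') into raw bytes."""
    data = bytearray()
    for line in text.splitlines():
        # Drop the 'NN:' offset prefix and keep the hex byte pairs.
        _offset, _sep, hexbytes = line.partition(":")
        data += bytes.fromhex(hexbytes)
    return bytes(data)


def bar_registers(config):
    """Return the six raw BAR registers of a type 0 config header."""
    return list(struct.unpack_from("<6I", config, 0x10))


config = parse_hexdump(DUMP)
vendor, device = struct.unpack_from("<HH", config, 0)
print(f"{vendor:04x}:{device:04x}",
      [f"{bar:08x}" for bar in bar_registers(config)])
```

The vendor and device IDs still decode as 12d9:001e, so the card is
answering config reads, but every BAR register is zero.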

AFAICT most of what lspci prints comes from decoding config space.
I suspect it gets the BAR info from the kernel (linux 4.14.0-rc4 ish)
in order to print the size.
It would be better if lspci reported the inconsistency.
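
For what it's worth, the check I have in mind could be sketched like
this in Python. It compares the BAR registers in a device's sysfs
'config' file with the start addresses in its 'resource' file (one
"start end flags" line per region, in hex). This is not how lspci does
it, just an illustration. Assumptions: a type 0 (endpoint) header, only
the low 32 bits of each address are compared (so the upper half of a
64-bit BAR is not checked), and all function names are hypothetical.

```python
import struct
from pathlib import Path


def config_bars(config):
    """Return the six raw BAR registers (offsets 0x10..0x27)."""
    return list(struct.unpack_from("<6I", config, 0x10))


def kernel_starts(resource_text):
    """Return the start address of the first six sysfs resource entries."""
    starts = []
    for line in resource_text.splitlines()[:6]:
        start, _end, _flags = (int(field, 16) for field in line.split())
        starts.append(start)
    return starts


def stale_bars(config, resource_text):
    """Return (index, register_address, kernel_address) per mismatch."""
    mismatches = []
    pairs = zip(config_bars(config), kernel_starts(resource_text))
    for index, (raw, start) in enumerate(pairs):
        if start == 0:
            continue  # the kernel has nothing assigned to this BAR
        # Bit 0 set means an I/O BAR (2 flag bits), else memory (4 flag bits).
        mask = 0xFFFFFFFC if raw & 1 else 0xFFFFFFF0
        if (raw & mask) != (start & 0xFFFFFFFF):
            mismatches.append((index, raw & mask, start))
    return mismatches


def check_device(bdf):
    """Run the comparison for one device, e.g. '0000:09:00.0'."""
    base = Path("/sys/bus/pci/devices") / bdf
    config = (base / "config").read_bytes()
    return stale_bars(config, (base / "resource").read_text())
```

With the card in the state shown above I would expect
check_device("0000:09:00.0") to list the first three BARs, since the
registers read zero while the kernel still has addresses assigned.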

	David



