Re: 3.9-rc1: pciehp and eSATA card SiI 3132, no XHCI

Huang Ying wrote:
> On Sun, 2013-03-31 at 17:04 +0200, Martin Mokrejs wrote:
>> Hi Ying,
>>   
>> Huang Ying wrote:
>>> Hi, Martin,
>>>
>>> Thanks for your testing!
>>>
>>> On Sun, 2013-03-31 at 12:35 +0200, Martin Mokrejs wrote:
>>>> Hi Ying,
>>>>   I have tested your last patch four times. Somehow nothing gets logged to "dmesg"
>>>> when I hot-remove or hot-insert the cold-booted eSATA card. Logging itself works:
>>>> enabling wifi via Fn+F2 is logged, and so are any stack traces
>>>> and kmemleak reports.
>>>>   I removed the coldbooted card, inserted it and ejected it.
>>>>
>>>>
>>>>   In brief, lspci reports changes but there are no changes in /proc/interrupts
>>>> related to
>>>>
>>>>   19:          0          0   IO-APIC-fasteoi   sata_sil24
>>>>
>>>>
>>>> and no changes at all in /proc/iomem which I expected to happen during
>>>> hotremoval and hotinsert (something broken in 3.9-rc1 with your patch).
>>>>
>>>> All the runtime_status data were the same after every tested step, so again,
>>>> there are no diffs to show, but here are the values confirming that
>>>> laptop-mode-tools enabled power saving:
>>>>
>>>> /sys/bus/pci/devices/0000:00:00.0/power/runtime_status:suspended
>>>> /sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
>>>> /sys/bus/pci/devices/0000:00:16.0/power/runtime_status:suspended
>>>> /sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:suspended
>>>> /sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
>>>> /sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:suspended
>>>> /sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
>>>> /sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
>>>> /sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:active
>>>> /sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
>>>
>>> It appears that 1c.7 is identified successfully as a hotplug-able PCIe
>>> port, and is never put into the suspended state.
>>
>> Yes. To be honest, now that I have also tested your previous two patches
>> on 3.9-rc1, I can confirm that syslog logging is broken with all three of
>> your patches. I fear that with the pciehp problems we are hitting not a
>> power-saving issue but outdated /proc or /sys files upstream.
>> Otherwise I can't figure out why disabling laptop-mode-tools at runtime
>> and doing the "find /sys .... | while ... echo "on" > $f" trickery
>> does not help to get pciehp working. This would have fixed acpiphp,
>> at least on the 3.8 kernel. I see that sata_sil24 is not loaded automatically
>> during hot-insert. It seems lspci reports 0xff at such times for the 11:00
>> eSATA card, /proc/iomem reports stale memory regions used by 11:00, while
>> /proc/interrupts says no IRQ is assigned to sata_sil24 (well, sata_sil24
>> is not loaded according to lsmod; lspci should report sata_sil24 as well, but
>> given that the 11:00 entry is broken and shows 0xff, it perhaps cannot
>> report whether sata_sil24 is loaded).
>>
>> I will post a few more details as a proper answer to your other patch,
>> where I managed to get yet another stack trace, about the eSATA card being
>> thought to be in D3 state. Physically the card was ejected, and just a
>> modprobe sata_sil24 caused sata_sil24 to use some outdated data. I will
>> dive into that now.
>>
>>
>>
>>>
>>> And from your description below, it appears that hot-add and hot-remove
>>> of the eSATA card works for you, doesn't it?
>>
>> The PresDet works fine I think, yes. Sometimes I see in the lspci -vvv diffs:
>>
>> -Control: I/O+ ... BusMaster+
>> +Control: I/O- ... BusMaster-
> 
> But after hot-insert, can you use your eSATA card?  It appears that it
> is detected properly.

Can't say about the above two. But under pciehp what is broken is the hotremoval.
I think the rest is just a downstream consequence.

> 
>> and sometimes 
>>
>> -        Latency: 0, Cache Line Size: 64 bytes
>> +        Latency: 0

It seems to me that bridges in lspci output have 'Latency: 0' while end devices have
the Cache Line Size as well.

When the card is hot-inserted after a previous hot removal and seems "dead", then
lspci says:
Control: I/O- Mem- BusMaster-
Interrupt: pin A routed to IRQ 19
and neither 'Latency:' nor 'Cache Line Size:' appears in the output for the 11:00 device.

But please realize this is likely screwed up because a previous eject of the card did not
fully release resources. When the slot was empty lspci reported 0xff, and when it is
occupied it likely reports some crap. Unless the bug causing 'stale' data to be reported
(the 'Re: 3.8.2: stale pci device info for a previously inserted express card' thread)
is fixed, I wonder what we can trust in this output.
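
One way to cross-check lspci here is to read the raw config space from sysfs
and decode the same fields by hand. The following is only a minimal sketch in
Python 3, not a tested tool. It assumes the card appears as 0000:11:00.0 (the
thread only says "11:00") and that /sys/bus/pci/devices/<address>/config is
readable. The helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: full address of the eSATA card; the thread only gives "11:00".
DEVICE = "0000:11:00.0"
SYSFS_PCI = Path("/sys/bus/pci/devices")


def read_config(address, root=SYSFS_PCI):
    """Return the raw config-space bytes of a device, or b"" if unreadable."""
    try:
        return (Path(root) / address / "config").read_bytes()
    except OSError:
        return b""


def looks_absent(config):
    """True when the 64-byte standard header reads back as all 0xff."""
    header = config[:64]
    return len(header) > 0 and all(byte == 0xFF for byte in header)


def decode_header(config):
    """Decode the header fields that lspci prints as Control/Latency/Cache Line Size."""
    command = int.from_bytes(config[0x04:0x06], "little")
    return {
        "vendor": int.from_bytes(config[0x00:0x02], "little"),
        "device": int.from_bytes(config[0x02:0x04], "little"),
        "io": bool(command & 0x1),         # Control: I/O+
        "mem": bool(command & 0x2),        # Control: Mem+
        "busmaster": bool(command & 0x4),  # Control: BusMaster+
        "cache_line_bytes": config[0x0C] * 4,  # register counts 32-bit words
        "latency": config[0x0D],
    }


def describe(address=DEVICE):
    """Summarise what the raw config space says about one device."""
    config = read_config(address)
    if not config:
        return f"{address}: no sysfs entry"
    if len(config) < 64:
        return f"{address}: short read ({len(config)} bytes)"
    if looks_absent(config):
        return f"{address}: config space reads all 0xff"
    return f"{address}: {decode_header(config)}"
```

Calling describe() after each eject and insert and comparing the result with
lspci -vvv would show whether both agree on the 0xff state, or whether one of
them is reporting cached values.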

>>
>> or even the Latency: line being gone completely from lspci -vvv output. Why is that?
>> I think debug checks and prints in kernel are necessary.
>>
>>
>> How do these relate to /proc/interrupts not showing an IRQ for the 11:00 device?
>> Does that prevent automated sata_sil24 loading once the card is inserted? Would
>> you please add some extra debug prints and checks to the kernel?
>>
>> Please also consider the "3.8.2: stale pci device info for a previously inserted express card"
>> thread for a list of chimeric entries reported by lspci. That could tell you which values
>> are being cached and are invalid. Hopefully some checks could be done between the values
>> read by lspci and those in /proc and /sys.
>>
>>
>>
>> Do you already know why almost nothing is logged by the kernel when any of your
>> three patches is applied (v1 sent on 03/29/13 08:41, v2 sent on 03/29/13 09:20,
>> v3 sent on 03/30/13 11:54)?
> 
> No.  Don't know why.  Can the unpatched upstream kernel produce a kernel log?

OK, vanilla 3.9-rc1 also prints nothing to syslog relevant to hotplug (only pciehp
tested). Logging itself works, as I said, rmmod sata_sil24 is logged. So, sorry,
your patches did NOT break logging.
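
Since dmesg stays silent, the before/after comparison of /proc/interrupts and
/proc/iomem described earlier in this thread could be scripted. This is a
minimal sketch in Python 3 under stated assumptions: the two search strings
("sata_sil24" and "0000:11:00") are guesses at what picks out the relevant
lines, and the function names are made up for illustration.

```python
import difflib
from pathlib import Path

# Assumption: these substrings select the lines of interest in each file.
NEEDLES = {"/proc/interrupts": "sata_sil24", "/proc/iomem": "0000:11:00"}


def read_text(path):
    """Return the file contents, or an empty string if it cannot be read."""
    try:
        return Path(path).read_text(errors="replace")
    except OSError:
        return ""


def snapshot(needles=NEEDLES):
    """Collect, per file, only the lines that contain that file's needle."""
    return {
        path: [line.strip() for line in read_text(path).splitlines() if needle in line]
        for path, needle in needles.items()
    }


def diff_snapshots(before, after):
    """Return unified-diff lines for every file that changed between snapshots."""
    report = []
    for path in sorted(set(before) | set(after)):
        report.extend(
            difflib.unified_diff(
                before.get(path, []),
                after.get(path, []),
                fromfile=f"{path} (before)",
                tofile=f"{path} (after)",
                lineterm="",
            )
        )
    return report
```

Taking one snapshot() before the eject and one after, then printing
diff_snapshots(before, after), gives an empty report when nothing changed,
which is the situation I described above. Note that the interrupt counters on
a matching line also show up as a difference once the IRQ is in use.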

Martin


> 
> Best Regards,
> Huang Ying
> 
>> I did not test the xHCI port behavior with any of your three patches because I have
>> disabled USB support in this kernel altogether for the 3.9-rc1 tests. And I would like to
>> stick with that until we fix the pciehp issue. I stepped rather late into the big testing
>> game. I believe the pciehp bug we are facing has been present since 3.5/3.6, so I don't
>> think the 3.9-rc1-based tests would be much different from those on earlier kernels.
>>
>> For a broader view, on the 3.8 series we will meanwhile hopefully get to a fix of the
>> PME# stuff. I think I reported quite a good number of potential problems yesterday.
>> After that, I will check how xHCI behaves on 3.9 but I believe the PME#-related fix from
>> 3.8 will be also applicable to fixing 3.9 so the xHCI won't have problems there anymore.
>>
>>
>> Martin
> 
> 
> 