Re: [RFC 0/3] acpipcihp: fix kernel crash on 2nd resume

Igor Mammedov wrote:
On Tue, 25 Jul 2023 09:51:53 -0400
Woody Suwalski <terraluna977@xxxxxxxxx> wrote:

Igor Mammedov wrote:
Changelog:
    * split out debug patch into a separate one with extra printk added
    * fixed inverted bus->self check (probably a reason why it didn't work before)


1/3 debug patch
2/3 offending patch
3/3 potential fix
I added more files to trace; add the following to the kernel command line:
     dyndbg="file drivers/pci/access.c +p; file drivers/pci/hotplug/acpiphp_glue.c +p; file drivers/pci/bus.c +p; file drivers/pci/pci.c +p; file drivers/pci/setup-bus.c +p; file drivers/acpi/bus.c +p" ignore_loglevel
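A minimal sketch of how that parameter is put together, in case the file list needs to be extended later. It assumes only what the line above shows: one "file <path> +p" query per source file, joined with "; ". The helper name build_dyndbg is hypothetical and not part of any kernel tooling.

```python
# Source files to trace, copied from the dyndbg line above.
FILES = [
    "drivers/pci/access.c",
    "drivers/pci/hotplug/acpiphp_glue.c",
    "drivers/pci/bus.c",
    "drivers/pci/pci.c",
    "drivers/pci/setup-bus.c",
    "drivers/acpi/bus.c",
]


def build_dyndbg(files):
    """Hypothetical helper: build the boot parameter string.

    Each file gets one 'file <path> +p' query (+p turns on printing
    for the debug call sites in that file); queries are joined by '; '.
    """
    queries = "; ".join(f"file {path} +p" for path in files)
    return f'dyndbg="{queries}" ignore_loglevel'


CMDLINE = build_dyndbg(FILES)
print(CMDLINE)
```

Appending another path to FILES and re-running prints an updated string that can be pasted onto the kernel command line.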

should be applied on top of
     e8afd0d9fccc PCI: pciehp: Cancel bringup sequence if card is not present

apply the patches one by one and run the testcase + capture dmesg after each patch
one should end up with 3 dmesg to analyse
   1st - old behaviour - no crash
   2nd - crash
   3rd - no crash hopefully

Igor Mammedov (3):
    acpiphp: extra debug hack
    PCI: acpiphp: Reassign resources on bridge if necessary
    acpipcihp: use __pci_bus_assign_resources() if bus doesn't have bridge

   drivers/pci/hotplug/acpiphp_glue.c | 23 ++++++++++++++++++-----
   1 file changed, 18 insertions(+), 5 deletions(-)
Actually applying patch1 is already creating the crash (why???),
probably it's due to an extra debug line I've added.
I dropped the suspicious one, can you try again and see if it works?

hence I
have added also dmesg-6.5-0.txt which shows a working condition based on
git e8afd0d9fccc level (acpiphp_glue in kernel 6.4)

Patch3 did not fix the issue, it seems that the culprit is somewhere
else triggered by  "benign" patch1 :-(

Also note about the trigger description in patch3: the dmesg trace on
Inspiron laptop is collected after the first wake from suspend to ram.
The subsequent attempt to sleep results in a frozen system.
Thanks for the clarification, I'll correct the commit message once the culprit
is found.

Good news. After removing the botched debug statement, which was masking the original issue, the testing went as you predicted, and with patch 3 the system suspends to RAM OK.

Here are the requested 3 dmesg outputs, #2 is for the bad run.

I can retest with a final version of the patch once you have it ready...

Thanks, Woody

Attachment: rfc1.tar.xz
Description: application/xz

