Re: UAF during boot on MTL based devices with attached dock

On Thu, Sep 19, 2024 at 08:06:03AM +0000, Wassenberg, Dennis wrote:
> We want to boot up an Intel MeteorLake based system (e.g. Lenovo
> ThinkPad X13 Gen5) with the Lenovo Thunderbolt 4 universal dock
> attached during boot.
[...]
> 0000:00 [root bus]
>       -> 0000:00:07.0 [bridge to 20-49]
>                      -> 0000:20:00.0 [bridge to 21-49]
>                                     -> 0000:21:00.0 [bridge to 22]
>                                        0000:21:01.0 [bridge to 23-2e]
>                                        0000:21:02.0 [bridge to 2f-3a]
>                                        0000:21:03.0 [bridge to 3b-48]
>                                        0000:21:04.0 [bridge to 49]
>          0000:00:07.2 [bridge to 50-79]

The kernel oopses in kthread irq/156-pciehp.  That belongs to
the Root Port 0000:00:07.0, as is evident from...

    [   12.850063] pcieport 0000:00:07.0: PME: Signaling with IRQ 156

...because PME and hotplug share the same interrupt.

What happens here is that pciehp checks on probe whether the slot is
unoccupied (neither the link nor the presence bit is set in config
space) yet in ON_STATE (the list subordinate->devices is non-empty,
see pcie_init()).

pciehp then synthesizes a Presence Detect Changed event to bring the
slot down, i.e. de-enumerate the device in the purportedly non-occupied
slot:

pciehp_probe()
  pciehp_check_presence()
    pciehp_request(ctrl, PCI_EXP_SLTSTA_PDC)

Corresponding messages:

    [   12.850866] pcieport 0000:00:07.0: pciehp: pciehp_check_link_active: lnk_status = 5041

Bit 13 in the Link Status register is not set (Data Link Layer Link Active).
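
To make the bit arithmetic concrete, here is a minimal userspace sketch
that decodes a Link Status value.  It assumes the logged value 5041 is
hexadecimal.  The masks follow the PCIe Link Status register layout (the
kernel has equivalent PCI_EXP_LNKSTA_* definitions in pci_regs.h); the
helper names link_active(), link_width() and print_lnksta() are made up
for this illustration and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Link Status register fields, per the PCIe register layout. */
#define LNKSTA_CLS        0x000f  /* Current Link Speed */
#define LNKSTA_NLW        0x03f0  /* Negotiated Link Width */
#define LNKSTA_NLW_SHIFT  4
#define LNKSTA_LT         0x0800  /* Link Training (bit 11) */
#define LNKSTA_SLC        0x1000  /* Slot Clock Configuration (bit 12) */
#define LNKSTA_DLLLA      0x2000  /* Data Link Layer Link Active (bit 13) */
#define LNKSTA_LBMS       0x4000  /* Link Bandwidth Management Status (bit 14) */

/* Hypothetical helper: true if Data Link Layer Link Active is set. */
static bool link_active(uint16_t lnk_status)
{
	return (lnk_status & LNKSTA_DLLLA) != 0;
}

/* Hypothetical helper: negotiated link width in lanes. */
static unsigned int link_width(uint16_t lnk_status)
{
	return (lnk_status & LNKSTA_NLW) >> LNKSTA_NLW_SHIFT;
}

/* Hypothetical helper: print the decoded fields of a Link Status value. */
static void print_lnksta(uint16_t lnk_status)
{
	printf("lnk_status = %04x\n", (unsigned int)lnk_status);
	printf("  speed code : %u\n", (unsigned int)(lnk_status & LNKSTA_CLS));
	printf("  width      : x%u\n", link_width(lnk_status));
	printf("  training   : %d\n", (lnk_status & LNKSTA_LT) != 0);
	printf("  slot clock : %d\n", (lnk_status & LNKSTA_SLC) != 0);
	printf("  DLLLA      : %d\n", link_active(lnk_status));
	printf("  LBMS       : %d\n", (lnk_status & LNKSTA_LBMS) != 0);
}
```

Calling print_lnksta(0x5041) from a main() shows bits 12 and 14 set and
bit 13 (DLLLA) clear, which is why pciehp considers the link down.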

    [   12.850880] pcieport 0000:00:07.0: pciehp: Slot(12): Card not present

Synthesize Presence Detect Changed event.

    [   12.850887] pcieport 0000:00:07.0: pciehp: pciehp_unconfigure_device: domain:bus:dev = 0000:20:00

De-enumerate child device.
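
The sequence above can be modelled in a few lines of userspace C.  This
is only a rough sketch of the decision described here, not the kernel's
implementation: struct slot_model, initial_state() and
events_to_synthesize() are hypothetical names, child_count stands in for
the subordinate->devices list, and only the one mismatch discussed in
this mail (slot unoccupied but in ON_STATE) is covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Presence Detect Changed bit in Slot Status. */
#define SLTSTA_PDC 0x0008

/* Subset of the slot states, enough for this illustration. */
enum slot_state { OFF_STATE, ON_STATE };

/* Hypothetical snapshot of what is seen at probe time. */
struct slot_model {
	bool presence_detected;   /* presence bit from config space */
	bool link_active;         /* Data Link Layer Link Active bit */
	unsigned int child_count; /* stand-in for subordinate->devices */
};

/* ON_STATE if devices were already enumerated below the port. */
static enum slot_state initial_state(const struct slot_model *s)
{
	return s->child_count ? ON_STATE : OFF_STATE;
}

/*
 * Return the event mask to synthesize on probe, or 0 if none.
 * Models only the case from this thread: neither presence nor link
 * is reported, yet child devices exist, so the slot is brought down.
 */
static uint16_t events_to_synthesize(const struct slot_model *s)
{
	bool occupied = s->presence_detected || s->link_active;

	if (!occupied && initial_state(s) == ON_STATE)
		return SLTSTA_PDC;

	return 0;
}
```

Feeding it the situation from the log (presence and link both clear, one
child device at 0000:20:00.0) yields SLTSTA_PDC, i.e. the de-enumeration
path; with either bit set it yields 0 and the slot is left alone.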

We definitely need to find out why the oops occurs, and it's good that
you found it and are able to reproduce it.  But the reason you're
seeing this on some devices and not on others is likely that the
Meteor Lake CPU oddly reports both presence and link as down even
though there's a device attached which is apparently accessible just
fine.

Not sure if that's a hardware erratum in Meteor Lake or a BIOS issue.

I'll need some more time to root cause the oops.  Sorry for the delay,
everyone is still decompressing after Plumbers last week.

Thanks,

Lukas



