Re: So, I had to revert d6d458d42e1 ("Handle DisplayPort tunnel activation asynchronously") too, to stop my resume crashes

On Mon, Mar 03, 2025 at 04:33:08AM -0800, Kenneth Crudup wrote:
> 
> OK, I may not be explaining the history properly, so more background:
> 
> (I tend to run Linus' master that I pull every few days, partially 'cause I
> like to see all the new fixes and features, and partially 'cause over the
> years I'll stumble over bugs and help the subsystems' Maintainer(s) fix the
> problems.)
> 
> Anyway, late last year I'd notice lately (it wasn't happening before) that
> once I'd get to the office, my laptop would be hard-hung on resume, which I
> eventually traced back to having my NVMe adaptor connected to my TB Dock
> when I suspended/hibernated. I'd started to try to bisect it, but couldn't
> find a good starting point (or one too far back) and would have to give up
> 'cause I'd run out of time. However, I'd mention the issue in the mailing
> lists, hoping for a solution- and that's when you'd discovered 9d573d19.
> 
> But between your NVMe discovery (and by this time I was mostly :( careful
> about disconnecting the NVMe adaptor before suspend) and sometime around the
> beginning of the year I was also getting occasional hard-hangs on resume
> even if I hadn't had the NVMe adaptor connected on suspend. I'd seen where
> the pstore dumps were pointing to the display driver, so I'd switched back
> to the i915 from the xe driver, but that hadn't fixed it either. In the
> meantime, having seen one of the OOPses be in __tb_path_deactivate_hop(),
> I'd dropped some printks (actually "tb_port_info()", I think) at various
> points printing the line# so I could try and tell approximately where the
> crash occurred (yeah, I know I need to get my ksymoops up and running :) ).
> I hadn't made the correlation yet between having an external monitor
> connected or not, and having seen a number of xe/i915/dp/Thunderbolt changes
> come thru, was both hoping for the fix to be reported and corrected, or try
> and find time and find out why it was happening via my tracing.
> 
> So in late February we'd had two failure modes for me in Linus' master:
> - 9d573d19 (NVMe adaptor connected on suspend causing an OOPS on resume)
> - d6d458d4 (OOPS if external USB-C DP monitor connected on resume)
> 
> I couldn't/didn't recognize the 2nd issue fully until you'd discovered the
> cause of the first one.
> 
> At home I have a Samsung Odyssey monitor connected to a USB-C-to-DP 2.1
> cable, to a TB port on a CalDigit TB4 dock.
> 
> My travel bag has a generic Chinese USB-C DP tunneling portable monitor
> which is usually connected to a Plugable TB hub.
> 
> In any case, the resume failures happen with either one.

Okay, thanks for elaborating on that.

> On 3/3/25 03:53, Mika Westerberg wrote:
> 
> > I thought the system resumes fine after you reverted the other commit
> > (9d573d19), no? Just you don't get display tunneled so for example if you
> > login over ethernet (ssh) you should still be able to get full dmesg.
> 
> Nah, it usually hard hangs if a monitor is connected when I resume; has to
> be power-cycled at that point.
> 
> > We can actually take PCIe out of the equation so that you ask "boltctl" to
> > forget the device temporarily (or from the GNOME settings "privacy and
> > security" -> "Thunderbolt" then "forget device" for each).  This means your
> > docks do not work fully but display should and then we hopefully can get
> > the dmesg.
> 
> Well my topology is almost always Laptop -> Dock -> Monitor .

Okay.

> This workflow came about ironically enough 'cause my client has given me a
> MS Surface (Windows) machine with only one TB/USB-C port, and since I will
> physically switch to using my own machine, to minimize setup changes I just
> use the "one cable for all" approach (i.e., never connecting the external
> monitor to the other TB port on my XPS-9320).
> 
> Oh and the failure mode for d6d458d4 is ALWAYS this, and always(?) from line
> 436/7 of ".../drivers/thunderbolt/path.c", a call to tb_port_write() :

That's also weird, because we don't do anything for DP tunnels on resume.
What this code does is clean up the tunnels left by the boot kernel (since
this is hibernate). The code added by d6d458d4 has not run yet at that
point; it runs only later, when we get hotplugs from the connected device's
DP OUT adapter. I will see if I can reproduce this on my setup next.



