Re: [REGRESSION] resume with a Thunderbolt dock broke with commit e8b908146d44 "PCI/PM: Increase wait time after resume"

On Fri, Aug 25, 2023 at 10:42:55AM +0200, Kamil Paral wrote:
> On Thu, Aug 24, 2023 at 1:43 PM Mika Westerberg
> <mika.westerberg@xxxxxxxxxxxxxxx> wrote:
> > One thing I noticed, probably has nothing to do with this, but you have
> > the "security level" set to "secure". Now this is fine and actually
> > recommended but I wonder if anything changes if you switch that
> > temporarily to "user"? What is happening here is that once the system
> > enters S3 the Thunderbolt driver tells the firmware to save the
> > connected device list, and then once it exits S3 it is expected to
> > re-connect the PCIe tunnels of the devices on that list but this is not
> > happening and that's why the dock "disappears" during resume.
> 
> That was a great suggestion. After switching to the user security
> level, the resume delay is gone, and my dock devices seem to be
> working almost immediately after resume! The dmesg for that is here:
> https://bugzilla-attachments.redhat.com/attachment.cgi?id=1985262
> 
> I've done tens of cycles and haven't found any race conditions, unlike
> with the TB assist mode. (Only once, my USB mouse wasn't working at
> all, but that's something that occasionally happens on most docks I've
> worked with and seems to be some different issue).
> 
> I'm sorry I haven't found this earlier myself. I did try switching
> these options, but I bundled it together with enabling the TB assist
> mode, which has quirks, so I didn't realize switching just this one
> option might have an impact.
> 
> > In any case we can conclude that the commit in question has nothing to
> > do with the issue. This is a completely Thunderbolt-related problem.
> 
> Considering the information above, does this appear to be a solely
> dock-related issue (bugged firmware), or does it make sense to follow
> up on this in some different kernel list? I have to say I'm completely
> OK with running the laptop using the "user" TB security level, but if
> you think I should follow up somewhere to get the "secure" level fixed
> (or some workaround applied, etc), I can.

I'm confused about this issue.  Correct me if I go wrong:

The hierarchy is:

  00:1c.4 Root Port to [bus 04-3c]
  04:00.0 Upstream Port (Thunderbolt) to [bus 05-3c]
  05:01.0 Downstream Port (Thunderbolt) to [bus 07-3b]
  07:00.0 Upstream Port (Thunderbolt) to [bus 08-3b]
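
As a sanity check on the hierarchy above, here is a minimal Python sketch
that parses those four lines and confirms each bridge sits on a bus inside
its parent's range, and that each bus range nests inside the parent's.  The
helper names (parse_bridges, is_nested_chain, below) are made up for
illustration; the input is only the text listed above, not lspci output.

```python
import re

# Bridge lines exactly as written in the hierarchy above.
HIERARCHY = """\
00:1c.4 Root Port to [bus 04-3c]
04:00.0 Upstream Port (Thunderbolt) to [bus 05-3c]
05:01.0 Downstream Port (Thunderbolt) to [bus 07-3b]
07:00.0 Upstream Port (Thunderbolt) to [bus 08-3b]
"""

# bus:dev.fn, a free-form description, then the secondary-subordinate range.
LINE_RE = re.compile(
    r"^(?P<bus>[0-9a-f]{2}):(?P<dev>[0-9a-f]{2})\.(?P<fn>[0-7])\s+"
    r"(?P<kind>.+?) to \[bus (?P<lo>[0-9a-f]{2})-(?P<hi>[0-9a-f]{2})\]$"
)


def parse_bridges(text):
    """Parse the bridge lines into dicts with integer bus numbers."""
    bridges = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a bridge line
        bridges.append({
            "bdf": f"{m['bus']}:{m['dev']}.{m['fn']}",
            "bus": int(m["bus"], 16),  # bus the bridge itself lives on
            "lo": int(m["lo"], 16),    # secondary bus
            "hi": int(m["hi"], 16),    # subordinate bus
        })
    return bridges


def is_nested_chain(bridges):
    """True if every bridge is below the previous one in the list."""
    for parent, child in zip(bridges, bridges[1:]):
        if not (parent["lo"] <= child["bus"] <= parent["hi"]):
            return False
        if not (parent["lo"] <= child["lo"] and child["hi"] <= parent["hi"]):
            return False
    return True


def below(bridges, bdf):
    """Return the listed bridges whose own bus is behind the bridge `bdf`."""
    top = next(b for b in bridges if b["bdf"] == bdf)
    return [
        b["bdf"] for b in bridges
        if b is not top and top["lo"] <= b["bus"] <= top["hi"]
    ]
```

For example, below(parse_bridges(HIERARCHY), "05:01.0") picks out 07:00.0,
the first bridge in the part of the tree discussed below.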

With security level=secure, before e8b908146d44 ("PCI/PM: Increase
wait time after resume"), resume takes ~5 seconds, but the hierarchy
below 05:01.0 gets removed and re-enumerated (dmesg [1]).  After
e8b908146d44, the same thing happens except the resume takes 60+
seconds (dmesg [2]).  In both cases, the devices (USB mouse, LAN, etc)
below 05:01.0 work after resume.

With security level=user, resume takes << 5 seconds regardless of
e8b908146d44, and the hierarchy below 05:01.0 does not get removed and
re-enumerated (dmesg [3]).
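
When comparing runs like these, it may help to record the active security
level next to each dmesg.  The sketch below assumes the kernel exposes it
through the Thunderbolt sysfs ABI as a "security" attribute under each
domain directory in /sys/bus/thunderbolt/devices; the function name and the
sysfs_root parameter are hypothetical and exist only so the code can be
pointed at a test directory.

```python
from pathlib import Path


def read_security_levels(sysfs_root="/sys/bus/thunderbolt/devices"):
    """Return {domain name: security level} for each Thunderbolt domain.

    Returns an empty dict if the directory does not exist, for example
    when the thunderbolt driver is not loaded.
    """
    levels = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return levels
    for domain in sorted(root.glob("domain*")):
        attr = domain / "security"
        if attr.is_file():
            # The attribute holds a single word such as "user" or "secure".
            levels[domain.name] = attr.read_text().strip()
    return levels
```

Calling read_security_levels() before suspend and again after resume would
show whether the level reported by the kernel matches the firmware setting
being tested.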

So if that's all accurate, it sounds like we've always had some
problem with security level=secure that causes the hierarchy to get
removed and re-enumerated, and e8b908146d44 just makes this problem
much more visible?

I don't know anything at all about how Thunderbolt security levels
work.  If "secure" means the hierarchy must be re-enumerated after
resume, can we detect that case immediately and get on with it without
having to wait for a timeout?

Bjorn

[1] https://bugzilla-attachments.redhat.com/attachment.cgi?id=1984726
[2] https://bugzilla-attachments.redhat.com/attachment.cgi?id=1984803
[3] https://bugzilla-attachments.redhat.com/attachment.cgi?id=1985262


