Sorry for the delay in my responses here; I had some things get in my
way.

On Fri, 9 Aug 2024 09:13:52 Oliver O'Halloran <oohall@xxxxxxxxx> wrote:
> Ok? If we have to check for DPC being enabled in addition to checking
> the surprise bit in the slot capabilities then that's fine, we can do
> that. The question to be answered here is: how should this feature
> work on ports where it's normal for a device to be removed without any
> notice?

I'm not sure if it's the correct thing to check, however. I assumed that
ports using the pciehp driver would usually consider it "normal" for a
device to be removed, but maybe I have the idea of hotplug reversed.

On Fri, 9 Aug 2024 14:34:04 Maciej W. Rozycki <macro@xxxxxxxxxxx> wrote:
> Well, in principle in a setup with reliable links the LBMS bit may never
> be set, e.g. this system of mine has been in 24/7 operation since the last
> reboot 410 days ago and for the devices that support Link Active reporting
> it shows:
> ...
> so out of 11 devices 6 have the LBMS bit clear. But then 5 have it set,
> perhaps worryingly, so of course you're right, that it will get set in the
> field, though it's not enough by itself for your problem to trigger.

The way I look at it is that it's essentially a probability distribution
over time, but I try to avoid learning too much about the physical layer
because I would find myself debugging more hardware issues lol. I also
don't think LBMS/LABS being set is very interesting by itself without
knowing the rate at which it is being set. FWIW, I have seen some devices
in the past go into the recovery state many times a second and still
never downtrain, but at the same time they were setting the LBMS/LABS
bits, which may not be quite spec compliant.

I would like to help test these changes, but I would like to avoid
having to test each mentioned change individually. Does anyone have any
preferences on how I batch the patches for testing? Would it be ok if I
just pulled them all together in one go?

- Matt