On Friday 09 July 2021 19:03:37 Maximilian Luz wrote: > On 7/9/21 6:12 PM, Pali Rohár wrote: > > [...] > > > > > Hello! Now I'm thinking loudly about this patch. Why this kind of reset > > > > is needed only for Surface devices? AFAIK these 88W8897 chips are same > > > > in all cards. Chip itself implements PCIe interface (and also SDIO) so > > > > for me looks very strange if this 88W8897 PCIe device needs DMI specific > > > > quirks. I cannot believe that Microsoft got some special version of > > > > these chips from Marvell which are different than version uses on cards > > > > in mPCIe form factor. > > > > > > > > And now when I'm reading comment below about PCIe bridge to which is > > > > this 88W8897 PCIe chip connected, is not this rather an issue in that > > > > PCIe bridge (instead of mwifiex/88W8897) or in ACPI firmware which > > > > controls this bridge? > > > > > > > > Or are having other people same issues on mPCIe form factor wifi cards > > > > with 88W8897 chips and then this quirk should not DMI dependent? > > > > > > > > Note that I'm seeing issues with reset and other things also on chip > > > > 88W8997 when is connected to system via SDIO. These chips have both PCIe > > > > and SDIO buses, it just depends which pins are used. > > > > > > > > > > Hi and thanks for the quick reply! Honestly I've no idea, this is just the > > > first method we found that allows for a proper reset of the chip. What I > > > know is that some Surface devices need that ACPI DSM call (the one that was > > > done in the commit I dropped in this version of the patchset) to reset the > > > chip instead of this method. > > > > > > Afaik other devices with this chip don't need this resetting method, at > > > least Marvell employees couldn't reproduce the issues on their testing > > > devices. > > > > > > So would you suggest we just try to match for the pci chip 88W8897 instead? > > > > Hello! Such suggestion makes sense when we know that it is 88W8897 > > issue. But if you got information that issue cannot be reproduced on > > other 88W8897 cards then matching 88W8897 is not correct. > > > > From all this information looks like that it is problem in (Microsoft?) > > PCIe bridge to which is card connected. Otherwise I do not reason how it > > can be 88W8897 affected. Either it is reproducible on 88W8897 cards also > > in other devices or issue is not on 88W8897 card. > > I doubt that it's an issue with the PCIe bridge (itself at least). The > same type of bridge is used for both dGPU and NVME SSD on my device (see > lspci output below) and those work fine. Also if I'm seeing that right > it's from the Intel CPU, so my guess is that a lot more people would > have issues with that then. >From information below it seems to be related to surprise removal. Therefore is surprise removal working without issue for dGPU or NVME SSD? Not all PCIe bridges support surprise removal... > > I don't know about the hardware side, so it might be possible that it's > an issue with integrating both bridge and wifi chip, in which case it's > still probably best handled via DMI quirks unless we know more. > > Also as Tsuchiya mentioned in his original submission, on Windows the > device is reset via this D3cold method. I've only skimmed that > errata.inf file mentioned, but I think this is what he's referring to: > > Controls whether ACPIDeviceEnableD3ColdOnSurpriseRemoval rule will be > evaluated or not on a given platform. Currently > ACPIDeviceEnableD3ColdOnSurpriseRemoval rule only needs to be > evaluated on Surface platforms which contain the Marvell WiFi > controller which depends on device going through D3Cold as part of > surprise-removal. > > and > > Starting with Windows releases *after* Blue, ACPI will not put > surprise-removed devices into D3Cold automatically. Some known > scenarios (viz. WiFi reset/recovery) rely on the device cycling > through D3Cold on surprise-removal. This hack allows surprise-removed > devices to be put into D3Cold (if supported by the stack). > > So, as far as I can tell, the chip doesn't like to be surprise-removed > (which seems to happen during reset) and then needs to be power-cycled, > which I think is likely due to some issue with firmware state. Thanks for information. This really does not look like PCIe bridge specific if bridge itself can handle surprise-removed devices. lspci can tell us if bridge supports it or not (see below). > So the quirk on Windows seems very Surface specific. > > There also seem a bunch of revisions of these chips around, for example > my SB2 is affected by a bug that we've tied to the specific hardware > revision which causes some issues with host-sleep (IIRC chip switches > rapidly between wake and sleep states without any external influence, > which is not how it should behave and how it does behave on a later > hardware revision). Interesting... This looks like the issue can be in 88W8897 chip and needs some special conditions to trigger? And Surface is triggering it always? > > > Then we'd probably have to check if there are any laptops where multiple > > > devices are connected to the pci bridge as Amey suggested in a review > > > before. > > > > Well, I do not know... But if this is issue with PCIe bridge then > > similar issue could be observed also for other PCIe devices with this > > PCIe bridge. But question is if there are other laptops with this PCIe > > bridge. And also it can be a problem in ACPI firmware on those Surface > > devices, which implements some PCIe bridge functionality. So it is > > possible that issue is with PCIe bridge, not in HW, but in SW/firmware > > part which can be Microsoft specific... So too many questions to which > > we do not know answers. > > > > Could you provide output of 'lspci -nn -vv' and 'lspci -tvnn' on > > affected machines? If you have already sent it in some previous email, > > just send a link. At least I'm not able to find it right now and output > > may contain something useful... > > From my Surface Book 2 (with the same issue): > > - lspci -tvnn: https://paste.ubuntu.com/p/mm3YpcZJ8N/ > - lspci -vv -nn: https://paste.ubuntu.com/p/dctTDP738N/ Could you re-run lspci under root account? There are missing important parts like "Capabilities: <access denied>" where is information if bridge supports surprise removal or not. > Regards, > Max