[Public] > -----Original Message----- > From: Bjorn Helgaas <helgaas@xxxxxxxxxx> > Sent: Wednesday, March 23, 2022 16:43 > To: Limonciello, Mario <Mario.Limonciello@xxxxxxx> > Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>; open list:PCI SUBSYSTEM <linux- > pci@xxxxxxxxxxxxxxx>; Linux PM <linux-pm@xxxxxxxxxxxxxxx>; Mehta, Sanju > <Sanju.Mehta@xxxxxxx>; Rafael J. Wysocki <rafael@xxxxxxxxxx> > Subject: Re: [PATCH v4] PCI / ACPI: Assume `HotPlugSupportInD3` only if device > can wake from D3 > > On Wed, Mar 23, 2022 at 09:26:15PM +0000, Limonciello, Mario wrote: > > [Public] > > > > > > > > On Wed, Mar 23, 2022 at 11:01:05AM -0500, Mario Limonciello wrote: > > > > On 3/15/22 11:36, Rafael J. Wysocki wrote: > > > > > On Tue, Mar 15, 2022 at 5:35 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> > > > wrote: > > > > > > > > > > > > On Tue, Mar 15, 2022 at 4:33 PM Mario Limonciello > > > > > > <mario.limonciello@xxxxxxx> wrote: > > > > > > > > > > > > > > According to the Microsoft spec the _DSD `HotPlugSupportInD3` is > > > > > > > indicates the ability for a bridge to be able to wakeup from D3: > > > > > > > > > > > > > > This ACPI object [HotPlugSupportInD3] enables the operating > system > > > > > > > to identify and power manage PCIe Root Ports that are capable of > > > > > > > handling hot plug events while in D3 state. > > > > > > > > > > > > > > This however is static information in the ACPI table at BIOS > compilation > > > > > > > time and on some platforms it's possible to configure the firmware at > > > boot > > > > > > > up such that _S0W returns "0" indicating the inability to wake up the > > > > > > > device from D3 as explained in the ACPI specification: > > > > > > > > > > > > > > 7.3.20 _S0W (S0 Device Wake State) > > > > > > > > > > > > > > This object evaluates to an integer that conveys to OSPM the > deepest > > > > > > > D-state supported by this device in the S0 system sleeping state > > > > > > > where the device can wake itself. > > > > > > > > > > > > > > This mismatch may lead to being unable to enumerate devices behind > the > > > > > > > hotplug bridge when a device is plugged in. To remedy these > situations > > > > > > > that `HotPlugSupportInD3` is specified by _S0W returns 0, explicitly > > > > > > > check that the ACPI companion has returned _S0W greater than or > equal > > > > > > > to 3 and the device has a GPE allowing the device to generate wakeup > > > > > > > signals handled by the platform in `acpi_pci_bridge_d3`. > > > > > > > > > > > > > > Windows 10 and Windows 11 both will prevent the bridge from going > in > > > D3 > > > > > > > when the firmware is configured this way and this changes aligns the > > > > > > > handling of the situation to be the same. > > > > > > > > > > > > > > Link: > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuefi.org > %2F&data=04%7C01%7CMario.Limonciello%40amd.com%7C095e9cb1cb3 > 34458a08d08da0d1610bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0 > %7C637836685684066439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata > =aAj91EajV77J7GjWEGGhAYLAb9Al2TMsH1baasdVdyI%3D&reserved=0 > > > > %2Fhtmlspecs%2FACPI_Spec_6_4_html%2F07_Power_and_Performance_Mgmt > > > %2Fdevice-power-management-objects.html%3Fhighlight%3Ds0w%23s0w- > s0- > > > device-wake- > > > > state&data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37a > > > > c47640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0 > > > > %7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw > > > > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata > > > > =Cw1OTYiX9BD3gh8eN3Zyz6%2FK8YFgqbn6bgi9%2FjNsnrM%3D&reserved > > > =0 > > > > > > > Link: > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.mi > %2F&data=04%7C01%7CMario.Limonciello%40amd.com%7C095e9cb1cb3 > 34458a08d08da0d1610bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0 > %7C637836685684066439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata > =v8YYoxTCp1drb7ZxgbDkrgA2Nc3TxbENLBfPxbo1CTI%3D&reserved=0 > > > crosoft.com%2Fen-us%2Fwindows-hardware%2Fdrivers%2Fpci%2Fdsd-for- > pcie- > > > root-ports%23identifying-pcie-root-ports-supporting-hot-plug-in- > > > > d3&data=04%7C01%7Cmario.limonciello%40amd.com%7C1f96c7aa37ac4 > > > > 7640d6208da0d0efd67%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0% > > > > 7C637836655281536690%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwM > > > > DAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata= > > > > UkB5lmz1QBUzWVM6%2BuNzJsleP%2Fi%2BDCJJuSgINiNbymo%3D&reserv > > > ed=0 > > > > > > > Fixes: 26ad34d510a87 ("PCI / ACPI: Whitelist D3 for more PCIe > hotplug > > > ports") > > > > > > > Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx> > > > > > > > > > > > > No more comments from me: > > > > > > > > > > > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > > > > > > > > > > Or please let me know if I should pick it up. > > > > > > > > Bjorn, > > > > > > > > Friendly reminder on this one. > > > > > > Thanks; we're in the middle of the merge window now, so I'll wait till > > > that's over unless this is an emergency. > > > > Actually it's a pretty important problem. We waffled on the nuts and > > bolts of this commit during the 5.17 RC's, but I didn't think we would > > make it out to the merge window before it got picked. I guess I should > > have spoke up on the urgency earlier. > > > > > IIUC the bug this fixes is that "when a bridge is in D3cold, we don't > > > notice when a device is hot-added below it," right? So we need to > > > avoid putting the bridge in D3cold? > > > > When the ASL has been configured this way (to return 0 for _S0W) the > > lower level hardware implementation will lead to hotplugged devices > > not being detected. > > Without this commit the hardware will expect to be in D0, but the kernel > > will select D3hot. So yes; the outcome is that hotplugged PCIe devices > > don't work. > > > > > Is there a typical scenario where this bites users? I don't think we > > > ever saw an actual problem report? > > > > This is the common way that these systems are being shipped. I have > > plenty of private reports related to this, but nothing public to link to. > > I still have no clue how many people this affects, what kernel > versions, etc. If there are no public reports, it suggests that it > doesn't affect distro kernels or production machines yet. > > Can we at least list some common scenarios? E.g., it affects kernels > after commit X, or it affects machines with CPUs newer than Y, or it > affects a certain kind of tunneling, etc? Distros need this > information so they can figure whether and how far to backport things > like this. It's going to affect any retail machine with the SOC we refer to in the kernel as "Yellow Carp". This is one of the first non-Intel USB4 hosts and will be using the USB4 SW CM in the kernel. Without this change, effectively PCIe tunneling will not work when any downstream PCIe device is hotplugged. In the right circumstances it might work if it's coldbooted (if the paths selected by the pre-boot firmware connection manager are identical to that selected by SW CM). Yellow carp isn't supported in the kernel until 5.14, so should this be backported it shouldn't need to go any further than 5.14. (Realistically though no distro should be on 5.14, they should have gone to the 5.15 LTS). Should it behave well baking in mainline, I would also later suggest to bring it back to 5.15.y and later as well for distros to pickup. > > > FYI: the earlier version of this was: > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchw > ork.kernel.org%2Fproject%2Flinux-usb%2Fpatch%2F1646658319-59532-1-git- > send-email- > Sanju.Mehta%40amd.com%2F&data=04%7C01%7CMario.Limonciello%40a > md.com%7C095e9cb1cb334458a08d08da0d1610bf%7C3dd8961fe4884e608e11 > a82d994e183d%7C0%7C0%7C637836685684066439%7CUnknown%7CTWFpbGZ > sb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0 > %3D%7C3000&sdata=tMSmfEl2%2FcvcFwNr7tg8QBi3s6HA8ZBgg5rCa7DZD > Hg%3D&reserved=0 > > This is basically to intentionally pull the device out of D3hot when a > > tunnel is created.