On Tue, Aug 06, 2024 at 07:38:27PM +0300, Ilpo Järvinen wrote: > On Thu, 1 Aug 2024, Bjorn Helgaas wrote: > > On Thu, Aug 01, 2024 at 08:05:23PM +0800, Jay Fang wrote: > > > On 2024/8/1 5:46, Bjorn Helgaas wrote: > > > > On Tue, Jul 30, 2024 at 05:57:43PM +0800, Ding Hui wrote: > > > >> On 2024/7/30 9:16, Jay Fang wrote: > > > >>> From 'commit 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal > > > >>> to avoid use-after-free")' we know that PCIe spec r6.0, sec 7.5.3.7, > > > >>> recommends that software program the same ASPM Control(pcie_link_state) > > > >>> value in all functions of multi-function devices, and free the > > > >>> pcie_link_state when any child function is removed. > > > >>> > > > >>> However, ASPM Control sysfs is still visible to other children even if it > > > >>> has been removed by any child function, and careless use it will > > > >>> trigger use-after-free error, e.g.: > > > >>> > > > >>> # lspci -tv > > > >>> -[0000:16]---00.0-[17]--+-00.0 Device 19e5:0222 > > > >>> \-00.1 Device 19e5:0222 > > > >>> # echo 1 > /sys/bus/pci/devices/0000:17:00.0/remove // pcie_link_state will be released > > > >>> # echo 1 > /sys/bus/pci/devices/0000:17:00.1/link/l1_aspm // will trigger error > > > >>> > > > >>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030 > > > >>> Call trace: > > > >>> aspm_attr_store_common.constprop.0+0x10c/0x154 > > > >>> l1_aspm_store+0x24/0x30 > > > >>> dev_attr_store+0x20/0x34 > > > >>> sysfs_kf_write+0x4c/0x5c > > > >>> > > > >>> We can solve this problem by updating the ASPM Control sysfs of all > > > >>> children immediately after ASPM Control have been freed. > > > >>> > > > >>> Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free") > > > >>> Signed-off-by: Jay Fang <f.fangjian@xxxxxxxxxx> > > > >>> --- > > > >>> drivers/pci/pcie/aspm.c | 2 ++ > > > >>> 1 file changed, 2 insertions(+) > > > >>> > > > >>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > >>> index cee2365e54b8..eee9e6739924 100644 > > > >>> --- a/drivers/pci/pcie/aspm.c > > > >>> +++ b/drivers/pci/pcie/aspm.c > > > >>> @@ -1262,6 +1262,8 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > > > >>> pcie_config_aspm_path(parent_link); > > > >>> } > > > >>> + pcie_aspm_update_sysfs_visibility(parent); > > > >>> + > > > >> > > > >> To be more rigorous, is there still a race window in > > > >> aspm_attr_{show,store}_common or clkpm_{show,store} before updating > > > >> the visibility, we can get an old or NULL pointer by > > > >> pcie_aspm_get_link()? > > > > > > > > Yeah, I think we still have a problem even with this patch. > > > > > > If so, maybe we need a new solution to completely sovle this problem. > > > > I think so. The pcie_link_state struct is kind of problematic to > > begin with. It basically encodes the PCI hierarchy again, even though > > the hierarchy is already completely described via struct pci_dev. > > > > IMO only the ASPM and clock PM state is really new information. I'm > > not convinced that we even need all of that (how can > > supported/enabled/capable/default/disabled *all* be useful and > > understandable?). But even if we *do* need all of that, it's only 39 > > bits of information per device. > > Hi all, > > To me, the most natural place for the link-related information such as > ASPM state would be inside struct pci_bus. Good point, that probably would make sense. > I actually did already take a look into migrating ASPM data there but the > way pcie_link_state is currently looked up through pci_dev (from both > ends of the link) seemed to make the conversion somewhat messy so I > postponed creating the patch for the migration. > > But it's certainly a change I'd like to see if somebody wants to look into > it. > > > > > For one thing, aspm_attr_store_common() captures the pointer from > > > > pcie_aspm_get_link() before the critical section, so by the time it > > > > *uses* the pointer, pcie_aspm_exit_link_state() may have freed the > > > > link state. > > > > > > > > And there are several other callers of pcie_aspm_get_link() that > > > > either call it before a critical section or don't have a critical > > > > section at all. > > > > > > > > I think it may be a mistake to alloc/free the link state separately > > > > from the pci_dev itself. > > > > > > > >>> mutex_unlock(&aspm_lock); > > > >>> up_read(&pci_bus_sem); > > > >>> } > > > >> > > > >> -- > > > >> Thanks, > > > >> - Ding Hui > > > >> > > > > > > > > . > > > > > > > > >