>> + * pcie_bus_update_set - update device mps when device doing hot-add >> + * @dev: PCI device to set >> + * >> + * After device hot add, mps will be set to default(128B), But the >> + * upstream port device's mps may be larger than 128B which was set >> + * by firmware during system bootup. Then we should update the device >> + * mps to equal to its parent mps, Or the device can not work normally. >> + */ >> +static void pcie_bus_update_set(struct pci_dev *dev) >> +{ >> + int mps, p_mps, mpss; >> + struct pci_dev *parent; >> + >> + if (!pci_is_pcie(dev) || !dev->bus->self >> + || !dev->bus->self->is_hotplug_bridge) > > Part of this looks redundant because pcie_bus_configure_set() already > checks pci_is_pcie(). And I don't know why we need to test > is_hotplug_bridge here; MPS settings need to be consistent regardless of > whether the upstream bridge supports hotplug. Hi Bjorn, I added is_hotplug_bridge() here is mainly to touch the hotplug case only. It was more like a temporary solution and not perfect one. > >> + return; >> + >> + parent = dev->bus->self; >> + mps = pcie_get_mps(dev); >> + p_mps = pcie_get_mps(parent); >> + >> + if (mps >= p_mps) >> + return; >> + >> + mpss = 128 << dev->pcie_mpss; >> + if (mpss < p_mps) { >> + dev_warn(&dev->dev, "MPSS %d smaller than upstream MPS %d\n" >> + "If necessary, use \"pci=pcie_bus_safe\" boot parameter to avoid this problem\n", >> + mpss, p_mps); >> + return; > > Since we can't configure the new device correctly, we really shouldn't > allow a driver to bind to it. The current design doesn't have much > provision for doing that, so warning is probably all we can do. Yes, bind a driver to the device which mps is not correctly set will cause another problem. > >> + } >> + >> + pcie_write_mps(dev, p_mps); >> + dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n", >> + pcie_get_mps(dev), 128 << dev->pcie_mpss, mps); >> +} >> + >> static void pcie_bus_detect_mps(struct pci_dev *dev) >> { >> struct pci_dev *bridge = dev->bus->self; >> @@ -1637,6 +1675,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data) >> return 0; >> >> if (pcie_bus_config == PCIE_BUS_TUNE_OFF) { >> + pcie_bus_update_set(dev); > > You're only adding this to the PCIE_BUS_TUNE_OFF path. Can't the same > problem occur for other pcie_bus_config settings? We only found the problem during PCIE_BUS_TUNE_OFF set. Other mode like PCIE_BUS_SAFE and PCIE_BUS_PEER2PEER. This issue won't happen. > >> pcie_bus_detect_mps(dev); >> return 0; >> } > > I have some long-term ideas here (below), but to make progress in the short > term, I think we just need to make sure this handles all pcie_bus_config > settings. > > Bjorn > > > > Stepping back a long ways, I think the current design is hard to use. > It's set up with the idea that we (1) enumerate all the devices in the > system, and then (2) configure MPS for everything all at once. > > That's not a very good fit when we start hotplugging devices, and it's > part of the reason MPS configuration is not well integrated into the PCI > core and doesn't get done at all for most architectures. Agree, arch code should not be involved the MPS setting. It's arch independent. > > What I'd prefer is something that could be done in the core as each device > is enumerated, e.g., in or near pci_device_add(). I know there's tension > between the need to do this before drivers bind to the device and the > desire to enumerate the whole hierarchy before committing to MPS settings. > But we need to handle that tension anyway for hot-added devices, so we > might as well deal with it at boot-time and use the same code path for > both boot-time and hot-add time. > > I have in mind something like this: > > pcie_configure_mps(struct pci_dev *dev) > { > int ret; > > if (!pci_is_pci(dev)) > return; > > if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) { > /* set my MPS to dev->pcie_mpss (max supported size) */ > return; > } > > if (dev->pcie_mpss >= upstream bridge MPS) { > /* set my MPS to upstream bridge MPS */ > return; > } > > ret = pcie_set_hierarchy_mps(pcie_root_port(dev), dev->mpss); > if (ret == failure) > /* emit warning, can't enable this device */ If got failure here, should roll back ? What about set hierarchy mps in reverse order(down to top). > } > > struct pci_dev *pcie_root_port(struct pci_dev *dev) > { > if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) > return dev; > > return pcie_root_port(dev->bus->self); > } > > pcie_set_hierarchy_mps(struct pci_dev *root, int mpss) > { > struct pci_bus *secondary; > struct pci_dev *dev; > int ret; > > if (root->driver) > return -EINVAL; Maybe it's not safe enough, change device's mps has risk unless all its children devices have no driver bound(disabled). A root port may has no pcieport driver bound, if pcieport driver probe failed. But its children device can work normally. > > secondary = root->subordinate; > if (secondary) { > list_for_each_entry(dev, &secondary->devices, bus_list) { > ret = pcie_set_hierarchy(dev, mpss); > if (ret) > return ret; > } > } > > /* set my MPS to mpss */ > return 0; > } > > . > -- Thanks! Yijing -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html