On Thu, Apr 8, 2021 at 7:13 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> Hi Bjorn, thanks for taking a look.
>
>
> On Thu, Apr 8, 2021 at 3:42 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >
> > [+cc Greg, Rafael, Matthew: device model questions]
> >
> > Hi Dan,
> >
> > On Thu, Apr 01, 2021 at 07:31:20AM -0700, Dan Williams wrote:
> > > Once the cxl_root is established then other ports in the hierarchy can
> > > be attached. The cxl_port object, unlike cxl_root that is associated
> > > with host bridges, is associated with PCIE Root Ports or PCIE Switch
> > > Ports. Add cxl_port instances for all PCIE Root Ports in an ACPI0016
> > > host bridge.
> >
> > I'm not a device model expert, but I'm not sure about adding a new
> > /sys/bus/cxl/devices hierarchy. I'm under the impression that CXL
> > devices will be enumerated by the PCI core as PCIe devices.
>
> Yes, PCIe is involved, but mostly only for the CXL.io slow path
> (configuration and provisioning via mailbox) when we're talking about
> memory expander devices (CXL calls these Type-3). So-called "Type-3"
> support is the primary driver of this infrastructure.
>
> You might be thinking of CXL accelerator devices that will look like
> plain PCIe devices that happen to participate in the CPU cache
> hierarchy (CXL calls these Type-1). There will also be accelerator
> devices that want to share coherent memory with the system (CXL calls
> these Type-2).
>
> The infrastructure being proposed here is primarily for the memory
> expander (Type-3) device case where the PCI sysfs hierarchy is wholly
> unsuited for modeling it. A single CXL memory region device may span
> multiple endpoints, switches, and host bridges. It poses similar
> stress to an OS device model as RAID where there is a driver for the
> component contributors to an upper level device / driver that exposes
> the RAID Volume (CXL memory region interleave set). The CXL memory
> decode space (HDM: Host Managed Device Memory) is independent of the
> PCIe MMIO BAR space.
>
> That's where the /sys/bus/cxl hierarchy is needed, to manage the HDM
> space across the CXL topology in a way that is foreign to PCIE (HDM
> Decoder hierarchy).
>
> > Doesn't
> > that mean we will have one struct device in the pci_dev, and another
> > one in the cxl_port?
>
> Yes, that is the proposal.
>
> > That seems like an issue to me. More below.
>
> hmm...
>
> >
> > > The cxl_port instances for PCIE Switch Ports are not
> > > included here as those are to be modeled as another service device
> > > registered on the pcie_port_bus_type.
> >
> > I'm hesitant about the idea of adding more uses of pcie_port_bus_type.
> > I really dislike portdrv because it makes a parallel hierarchy:
> >
> >   /sys/bus/pci
> >   /sys/bus/pci_express
> >
> > for things that really should not be different. There's a struct
> > device in pci_dev, and potentially several pcie_devices, each with
> > another struct device. We make these pcie_device things for AER, DPC,
> > hotplug, etc. E.g.,
> >
> >   /sys/bus/pci/devices/0000:00:1c.0
> >   /sys/bus/pci_express/devices/0000:00:1c.0:pcie002 # AER
> >   /sys/bus/pci_express/devices/0000:00:1c.0:pcie010 # BW notification
> >
> > These are all the same PCI device. AER is a PCI capability.
> > Bandwidth notification is just a feature of all Downstream Ports. I
> > think it makes zero sense to have extra struct devices for them. From
> > a device point of view (enumeration, power management, VM assignment),
> > we can't manage them separately from the underlying PCI device. For
> > example, we have three separate "power/" directories, but obviously
> > there's only one point of control (00:1c.0):
> >
> >   /sys/devices/pci0000:00/0000:00:1c.0/power/
> >   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/power/
> >   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie010/power/
>
> The superfluous power/ issue can be cleaned up with
> device_set_pm_not_required().
>
> What are the other problems this poses, because in other areas this
> ability to subdivide a device's functionality into sub-drivers is a
> useful organization principle? So much so that several device writer
> teams came together to create the auxiliary-bus for the purpose of
> allowing sub-drivers to be carved off for independent functionality
> similar to the portdrv organization.
>

Bjorn, any further thoughts on this? This port architecture question
is in the critical path for the next phase of CXL development
(targeting v5.14 not v5.13).