Fan Ni wrote:
> On Mon, Feb 13, 2023 at 01:31:17PM -0500, Gregory Price wrote:
>
> > On Mon, Feb 13, 2023 at 01:22:17PM -0500, Gregory Price wrote:
> > > On Fri, Feb 10, 2023 at 01:05:21AM -0800, Dan Williams wrote:
> > > > Changes since v1: [1]
> > > > [... snip ...]
> > > [... snip ...]
> > > Really i see these decoders and device mappings setup:
> > > port1 -> mem2
> > > port2 -> mem1
> > > port3 -> mem0
> >
> > small correction:
> > port1 -> mem1
> > port3 -> mem0
> > port2 -> mem2
> >
> > >
> > > Therefore I should expect
> > > decoder0.0 -> mem2
> > > decoder0.1 -> mem1
> > > decoder0.2 -> mem0
> > >
> >
> > this end up mapping this way, which is still further jumbled.
> >
> > Something feels like there's an off-by-one
> >
>
> Currently, the naming of memdevs can be out-of-order due to the
> following two reasons,
> 1. At kernel side, cxl port driver does async device probe, which can
> change the memdev naming even within a single OS boot and among multiple
> time of device enumeration. The pattern can be observed with following
> steps in the guest,
> loop(){
> a) modprobe cxl_xxx
> b)cxl list --> you will see the memdev name changes (like mem0->mem1).
> c) rmmod cxl_xxx
> }
> This behaviour can be avoided by using sync device probe by making the
> following change
> --------------------------------------------
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 258004f34281..f3f90fad62b5 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -663,7 +663,7 @@ static struct pci_driver cxl_pci_driver = {
>  	.probe = cxl_pci_probe,
>  	.err_handler = &cxl_error_handlers,
>  	.driver = {
> -		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> +		.probe_type = PROBE_FORCE_SYNCHRONOUS,
>  	},
>  };
> -------------------------------------------
>
> The above patch, you will see consistent memdev naming within one
> OS boot, however, the order can be still different from what we expect with
> the qemu config options we use. We need to make some change at QEMU side
> also as shown below.

This is by design. Kernel device name order is not guaranteed even with
synchronous probing and the async probing acts to make sure these names
are always random for memdevs.

For a memdev the recommendation is to identify them by 'host'/'path' or
by 'serial':

# cxl list -u -m 0000:35:00.0
{
  "memdev":"mem0",
  "pmem_size":"512.00 MiB (536.87 MB)",
  "serial":"0",
  "host":"0000:35:00.0"
}

# cxl list -u -s 0
{
  "memdev":"mem0",
  "pmem_size":"512.00 MiB (536.87 MB)",
  "serial":"0",
  "host":"0000:35:00.0"
}

Although, in real life a CXL device will have a non-zero unique serial
number.

> 2. Currently in Qemu, multiple components at the same topology level are
> stored in a data structure called QLIST as defined in
> include/qemu/queue.h. When enqueuing a component, current qemu code uses
> QLIST_INSERT_HEAD to insert the item at the head, but when iterating, it
> uses QLIST_FOREACH/QLIST_FOREACH_SAFE which is also from the head of the
> list. That is to say, if we enqueue items P1,P2,P3 in order, when iterating,
> we get P3,P2,P1. I have a simple test with the below code change(always
> insert to the list tail), the order issue is fixed.

Again, kernel does not and should not be expected to guarantee kernel
device name ordering.

Perhaps this merits /dev/cxl/by-path and /dev/cxl/by-id similar to
/dev/disk/by-path and /dev/disk/by-id for semi-persistent / persistent
naming. That's a conversation to have with the systemd-udev folks.
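
For anyone following point 2 of the quoted message, the ordering effect
can be reproduced outside QEMU. The following is a minimal C sketch, not
QEMU code: struct port, insert_head(), insert_tail(), walk() and
iteration_order() are hypothetical names standing in for a QLIST-style
list, and the sketch only assumes the behaviour described in the quote
(insert at the head, iterate from the head).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a component kept on a QLIST-style list. */
struct port {
    const char *name;
    struct port *next;
};

struct port_list {
    struct port *head;
    struct port *tail;
};

/* Head insertion: the newest element becomes the first one visited. */
static void insert_head(struct port_list *l, struct port *p)
{
    p->next = l->head;
    l->head = p;
    if (l->tail == NULL)
        l->tail = p;
}

/* Tail insertion: elements are visited in the order they were added. */
static void insert_tail(struct port_list *l, struct port *p)
{
    p->next = NULL;
    if (l->tail != NULL)
        l->tail->next = p;
    else
        l->head = p;
    l->tail = p;
}

/* Walk from the head and write the names, comma separated, into out. */
static void walk(const struct port_list *l, char *out, size_t outlen)
{
    assert(outlen > 0);
    out[0] = '\0';
    for (const struct port *p = l->head; p != NULL; p = p->next) {
        if (out[0] != '\0')
            strncat(out, ",", outlen - strlen(out) - 1);
        strncat(out, p->name, outlen - strlen(out) - 1);
    }
}

/* Enqueue P1, P2, P3 in that order, then report the iteration order. */
static void iteration_order(int at_tail, char *out, size_t outlen)
{
    struct port ports[3] = { { "P1", NULL }, { "P2", NULL }, { "P3", NULL } };
    struct port_list l = { NULL, NULL };

    for (size_t i = 0; i < 3; i++) {
        if (at_tail)
            insert_tail(&l, &ports[i]);
        else
            insert_head(&l, &ports[i]);
    }
    walk(&l, out, outlen);
}
```

With at_tail = 0, iteration_order() produces "P3,P2,P1"; with at_tail = 1
it produces "P1,P2,P3". Either way this only changes the order in which
the guest enumerates the devices. Per the reply above, it does not turn
kernel memdev names into stable identifiers.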