Re: [PATCH 0/5] cxl: Initialization and shutdown fixes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 10/12/24 22:57, Dan Williams wrote:
Alejandro Lucero Palau wrote:
[..]
I am skeptical that PROBE_FORCE_SYNCRONOUS is a fix for any
device-readiness bug. Some other assumption is violated if that is
required.

But that problem is not about device readiness but just how the device
model works. In this case the memdev creation is adding devices, no real
ones but those abstractions we use from the device model, and that
device creation is done asynchronously.
Device creation is not done asynchronously, the PCI driver is attaching
asynchrounously. When the PCI driver attaches it creates memdevs and
those are attached to cxl_mem synchronously.

memdev, a Type2 driver in my case, is going to work with such a device
abstraction just after the memdev creation, it is not there yet.
Oh, is the concern that you always want to have the memdev attached to
cxl_mem immediately after it is registered?

I think that is another case where "MODULE_SOFTDEP("pre: cxl_mem")" is
needed. However, to fix this situation once and for all I think I would
rather just drop all this modularity and move both cxl_port and cxl_mem
to be drivers internal to cxl_core.ko similar to the cxl_region driver.


Oh, so the problem is the code is not ready because the functionality is in a module not loaded yet.

Then it makes sense that change. I'll do it if not already taken. I'll send v4 without the PROBE_FORCE_SYNCHRONOUS flag and without the previous loop with delays implemented in v3.

Thanks



It is true that clx_mem_probe will interact with the real device, but
the fact is such a function is not invoked by the device model
synchronously, so the code using such a device abstraction needs to
wait until it is there. With this flag the waiting is implicit to
device creation.  Without that flag other "nasty dancing" with delays
and checks needs to be done as the code in v3 did.
It is invoked synchronously *if* the driver is loaded *and* the user has
not forced asynchronous attach on the command line in which case they
get to keep the pieces.

For the type-2 case I did have an EPROBE_DEFER in my initial RFC on the
assumption that an accelerator driver might want to wait until CXL is
initialized before the base accelerator proceeds. However, if
accelerator drivers behave the same as the cxl_pci driver and are ok
with asynchronus arrival of CXL functionality then no deferral is
needed.

I think deferring the accel driver makes sense. In the sfc driver case,
it could work without CXL and then change to use it once the CXL kernel
support is fully initialised, but I guess other accel drivers will rely
on CXL with no other option, and even with the sfc driver, supporting
such a change will make the code far more complex.
Makes sense.

Otherwise, the only motivation for synchronous probing I can think of
would be to have more predictable naming of kernel objects. So yes, I
would be curious to understand what scenarios probe deferral is still
needed.
OK. I will keep that patch with the last change in the v4. Let's discuss
this further with that patch as a reference.
EPROBE_DEFER when CXL is not ready yet is fine in the sfc driver, just
comment that the driver does not have a PCIe-only operation mode and
requires that CXL initializes.




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux