Re: [LSF/MM/BPF TOPIC] CXL Development Discussions

Dan Williams <dan.j.williams@xxxxxxxxx> · Mon, 6 May 2024 16:47:37 -0700

Adam Manzanares wrote:
> Hello all,
> 
> I would like to have a discussion with the CXL development community about
> current outstanding issues and also invite developers interested in RAS and
> memory tiering to participate.

Thanks for putting this together Adam!

> The first topic I believe we should discuss is how we can ensure as a group
> that we are prioritizing upstream work. On a recent upstream CXL development
> discussion call there was a call to review more work. I apologize for not
> grabbing the link, but I believe Dave Jiang is leveraging patchwork and this
> link should be shared with others so we can help get more reviews where needed.

Dave already replied here, but one thing I will add is that help keeping
an eye out for things that should be in the queue is welcome. Likely a
good way to do that is to send a note along with a review so both get
reflected in the tracking.

> The second topic I would like to discuss is how we integrate RAS features that
> have similar equivalents in the kernel. A CXL device can provide info about 
> memory media errors in a similar fashion to memory controllers that have EDAC
> support. Discussions have been put on the list and I would like to hear thoughts
> from the community about where this should go [1]. On the same topic CXL has 
> port level RAS features and the PCIe DW series touched on this issue  [2]

If I could uplevel this a bit there are multiple efforts in memory RAS
that likely want to figure out a cohesive story, or at least make
conscious decisions about implementation divergence. Some related work
that caught my eye:

* AMD MI300 specific poison handling that sounds similar to the CXL List
  Poison facility:
  http://lore.kernel.org/r/20240214033516.1344948-3-yazen.ghannam@xxxxxxx

* Scrub subsystem that has both ACPI and CXL intercepts:
  http://lore.kernel.org/r/20240419164720.1765-1-shiju.jose@xxxxxxxxxx

* Inconsistencies between firmware reported fatal errors and native
  error handling, compare:

  ghes_proc()::
        if (ghes_severity(estatus->error_severity) >= GHES_SEV_PANIC)
                __ghes_panic(ghes, estatus, buf_paddr, FIX_APEI_GHES_IRQ);

  ...vs:

  pcie_do_recovery()::
        /* TODO: Should kernel panic here? */
        pci_info(bridge, "device recovery failed\n");

  Also the inconsistencies between EXTLOG, GHES, BERT, and native error
  reporting.

> The third topic I would like to discuss is how we can get a set of common
> benchmarks for memory tiering evaluations. Our team has done some initial
> work in this space, but we want to hear more from end users about their 
> workloads of concern. There was a proposal related to this topic, but from what 
> I understand no meeting has been held [3]. 
> 
> The last topic that I believe is worth discussion is how do we come up with
> a baseline for testing. I am aware of 3 efforts that could be used cxl_test, 
> qemu, and uunit testing framework [4].

I think benchmarking for memory-tiering is orthogonal to patch
unit, function, and integration testing.

For testing I think it is an "all of the above plus hardware testing if
possible" situation. My hope is to get to a point where CXL patchwork
lights up "S/W/F" columns with backend tests similar to NETDEV
patchwork:

https://patchwork.kernel.org/project/netdevbpf/list/

There are some initial discussions about how to do this, and we can
likely grab some folks to discuss more.

I think Paul and Song would be useful to have for this discussion. Can
you recommend others that would be useful for this or other CXL
topics to help with timeslot conflict resolution?