1) Lower the DRAM bandwidth pressure by offloading some cachelines to CXL
   - reducing latency on DRAM and reducing average latency overall.
   The latency cost on CXL lines gets amortized over all DRAM fetches
   no longer hitting stalls.

2) Under full-pressure scenarios (DRAM and CXL are saturated), the
   additional lanes / buffers provide more concurrent fetches - i.e.
   you're just doing more work (and avoiding going to storage).

This is the weaker of the two scenarios.

No one is proposing we switch the default policy to weighted interleave.

= Performance summary =
(tests may have different configurations, see extended info below)
1) MLC (W2) : +38% over DRAM. +264% over default interleave.
   MLC (W5) : +40% over DRAM. +226% over default interleave.
2) Stream   : -6% to +4% over DRAM, +430% over default interleave.
3) XSBench  : +19% over DRAM. +47% over default interleave.

=====================================================================
Performance tests - MLC