Hi Gregory, thanks for your kind explanation.

On Tue, 18 Mar 2025 11:13:13 -0400 Gregory Price <gourry@xxxxxxxxxx> wrote:

> On Tue, Mar 18, 2025 at 08:02:46PM +0900, Honggyu Kim wrote:
> >
> >
> > On 3/18/2025 5:02 PM, Yunjeong Mun wrote:
> >
> > Some simple corrections here. host-bridge{0-3} above aren't detected from CEDT.
> > The corrected structure is as follows.
> >
> > rootport/
> > ├── socket0
> > │   ├── cross-host-bridge0 -> SRAT && CEDT (interleave on) --> NODE 2
> > │   │   ├── host-bridge0
> > │   │   │   ├── cxl0 -> CEDT > node 4
> > │   │   │   └── cxl1 -> CEDT > node 5
> > │   │   └── host-bridge1
> > │   │       ├── cxl2 -> CEDT > node 6
> > │   │       └── cxl3 -> CEDT > node 7
> > │   └── dram0 -> SRAT ---------------------------------------> NODE 0
> > └── socket1
> >     ├── cross-host-bridge1 -> SRAT && CEDT (interleave on)---> NODE 3
> >     │   ├── host-bridge2
> >     │   │   ├── cxl4 -> CEDT > node 8
> >     │   │   └── cxl5 -> CEDT > node 9
> >     │   └── host-bridge3
> >     │       ├── cxl6 -> CEDT > node 10
> >     │       └── cxl7 -> CEDT > node 11
> >     └── dram1 -> SRAT ---------------------------------------> NODE 1
> >
>
> This is correct and expected.
>
> All of these nodes are "possible" depending on how the user decides to
> program the CXL decoders and expose memory to the page allocator.
>
> In your /sys/bus/cxl/devices/ you should have something like
>
> decoder0.0  decoder0.1  decoder0.2  decoder0.3
> decoder0.4  decoder0.5  decoder0.6  decoder0.7
> decoder0.8  decoder0.9
>

Yes, I can see many decoder#.# files in there, and their devtype values are
shown below:

$ cat /sys/bus/cxl/devices/decoder*/devtype
cxl_decoder_root
...
cxl_decoder_switch
...
cxl_decoder_endpoint

> These are the root decoders that should map up directly with each CEDT
> CFMWS entry.
>
> 2 of them should have interleave settings.
>
> If you were to then program the endpoint and hostbridge decoders with
> the matching non-interleave address values from the other CEDT entries,
> you could bring each individual device online in its own NUMA node.
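
One way to check the "2 of them should have interleave settings" point is to
group the decoders by devtype and look at each root decoder's interleave_ways.
The following is only a minimal sketch, assuming the devtype and
interleave_ways attributes are readable under /sys/bus/cxl/devices/decoderX.Y
on this kernel. The helper names (read_attr, summarize_decoders,
interleaved_roots) are hypothetical and were not part of the original thread.

```python
from pathlib import Path


def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def summarize_decoders(base="/sys/bus/cxl/devices"):
    """Group decoderX.Y entries by devtype, keeping each one's interleave_ways.

    Assumption: every decoder directory exposes 'devtype' and
    'interleave_ways'. Missing attributes are recorded as "unknown" / None.
    """
    summary = {}
    base_path = Path(base)
    if not base_path.is_dir():
        return summary  # no CXL bus visible on this machine
    for dev in sorted(base_path.glob("decoder*")):
        devtype = read_attr(dev / "devtype") or "unknown"
        ways = read_attr(dev / "interleave_ways")
        summary.setdefault(devtype, []).append((dev.name, ways))
    return summary


def interleaved_roots(summary):
    """Names of root decoders whose interleave_ways is greater than 1."""
    return [
        name
        for name, ways in summary.get("cxl_decoder_root", [])
        if ways is not None and ways.isdigit() and int(ways) > 1
    ]


if __name__ == "__main__":
    decoders = summarize_decoders()
    for devtype, entries in decoders.items():
        print(devtype, len(entries))
    print("interleaved root decoders:", interleaved_roots(decoders))
```

If the platform matches the description in this thread, the root decoder count
should line up with the number of CEDT CFMWS entries, and the interleaved list
should name the two cross-host-bridge windows.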
>

I think this means that I can program the endpoint decoders
(=cxl_decoder_endpoint) to map to the 8 CFMWS entries, and the hostbridge
decoders (=cxl_decoder_switch) to map to the other 2 CFMWS entries
(cross-host bridge).

> Or, you can do what you're doing now, and program the endpoints to map
> to the 2 cross-host bridge interleave root decoders.

In my understanding, that kind of programming is done at the firmware or BIOS
layer, right?

>
> So your platform is giving you the option of how to online your devices,
> and as such it needs to mark nodes as "possible" even if they're unused.
>

Thank you for the clear explanation. I now understand why 'possible' has
such a value.

> ~Gregory
>

Best regards,
Yunjeong
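
For reference, the "possible even if unused" point can be checked by comparing
/sys/devices/system/node/possible with /sys/devices/system/node/online. The
sketch below is a rough illustration, not something from the original thread:
the helper names are hypothetical, and the "0-11" / "0-3" strings in the usage
note are assumed example values based on the node numbering in the tree above.

```python
from pathlib import Path


def parse_node_list(text):
    """Parse a sysfs range list such as "0-3,8" into a sorted list of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list or stray comma
        if "-" in part:
            lo, hi = part.split("-")
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def unused_possible_nodes(possible_text, online_text):
    """Nodes that are marked possible but are not currently online."""
    possible = set(parse_node_list(possible_text))
    online = set(parse_node_list(online_text))
    return sorted(possible - online)


if __name__ == "__main__":
    node_dir = Path("/sys/devices/system/node")
    possible_file = node_dir / "possible"
    online_file = node_dir / "online"
    if possible_file.exists() and online_file.exists():
        print(unused_possible_nodes(possible_file.read_text(),
                                    online_file.read_text()))
```

With the assumed values possible = "0-11" and online = "0-3",
unused_possible_nodes returns nodes 4 through 11, which would correspond to the
per-device nodes that stay unused while the endpoints are mapped to the two
interleaved windows.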