Hi Tony, On 7/19/2023 5:20 PM, Tony Luck wrote: > Here's a quick hack to see how things might look with > separate domain lists in the "L3" resource. > > For testing purposes on a non-SNC system I set ->mon_scope = > MON_SCOPE_NODE, but made domain_add_cpu() allocate the mondomains > list based on L3 scope ... just so I could check that I found all > the places where monitoring needs to use the mondomains list. > The kernel doesn't crash when running tools/testing/selftests/resctrl, > and the tests all pass. But that doesn't mean I didn't miss something. > > Some restructuring of control vs. monitoing initialization might > avoid some of the code I duplicated in domain_add_cpu(). But this > is intended just as a "Is this what you meant?" before I dig deeper. Thank you for considering the approach. I find that this sample move towards the idea while also highlighting what else can be considered. I do not know if you already considered these ideas and found it flawed so I will try to make it explicit so that you can point out to me where things will fall apart. The sample code introduces a new list "mondomains" that is intended to be used when the monitoring scope is different from the allocation scope. This introduces duplication when the monitoring and allocation scope is different. Each list, "domains" and "mondomains" will host structures that can accommodate both monitoring and allocation data, with the data not relevant to the list going unused as it is unnecessarily duplicated. Additionally this forces significant portions of resctrl to now always consider whether the monitoring and allocation scope is different ... note how this sample now has code like below scattered throughout. h = r->mon_scope ? &r->mondomains : &r->domains; I also find the domain_add_cpu() becoming intricate as it needs to navigate all the different scenarios. This unnecessary duplication, new environment checks needed throughout, and additional code complexities are red flags to me that this solution is not well integrated. To deal with these complexities I would like to consider if it may make things simpler to always (irrespective of allocation and monitoring scope) maintain allocation and monitoring domain lists. Each list need only carry data appropriate to its use ... the allocation list only has data relevant to allocation, the monitoring list only has data relevant to monitoring. This is the struct rdt_domain related split I mentioned previously. Code could become something like: resctrl_online_cpu() { ... for_each_alloc_capable_rdt_resource(r) alloc_domain_add_cpu(...) for_each_mon_capable_rdt_resource(r) mon_domain_add_cpu(...) ... } This would reduce complication in domain_add_cpu() since each domain list only need to concern itself with monitoring or allocation. Even resctrl_online_domain() can be siplified significantly by making it specific to allocation or monitoring. For example, resctrl_online_mon_domain() would only and always just run the monitoring related code. With the separate allocation and monitoring domain lists there may no longer be a need for scattering code with checks like: h = r->mon_scope ? &r->mondomains : &r->domains; This would be because the code can directly pick the domain list it is operating on. What do you think? The above is just refactoring of existing code and from what I can tell this would make supporting SNC straight forward. Reinette