On 24.01.25 10:56, Michal Koutný wrote:
On Fri, Jan 17, 2025 at 08:02:55PM +0100, Friedrich Vock <friedrich.vock@xxxxxx> wrote:
Yeah, there are pools for the whole path between limit_pool and
test_pool, but the issue is that we traverse the entire tree of cgroups,
and we don't always stay on the path between limit_pool and test_pool
(because we're iterating from the top down, and we don't know what the
path is in that direction - so we just traverse the whole tree until we
find test_pool).
This means that we'll sometimes end up straying off-path - and there are
no guarantees for which pools are present in the cgroups we visit there.
These cgroups are the potentially problematic ones where the issue can
happen.
Ideally we could always stay on the path between limit_pool and
test_pool, but this is hardly possible because we can only follow parent
links (so bottom-up traversal) but for accurate protection calculation
we need to traverse the path top-down.
Aha, thanks for bearing with me.
css_foreach_descendant_pre(css, limit_pool->cs->css) {
dmemcg_iter = container_of(css, struct dmemcg_state, css);
struct dmem_cgroup_pool_state *found_pool = NULL;
list_for_each_entry_rcu(pool, &dmemcg_iter->pools, css_node) {
if (pool->region == limit_pool->region) {
found_pool = pool
break;
}
}
if (!found_pool)
continue;
page_counter_calculate_protection(
climit, &found->cnt, true);
}
Here I use (IMO) more idiomatic css_foreach_descendant_pre() instead and
I use the predicate based on ->region (correct?) to match pool's
devices.
Good catch with ->region! That works well indeed, I think it might not
have been a thing back when I wrote this tree traversal :>
I've applied this snippet (with a few minor edits) and taken it for a
test run too - it appears to work nicely in practice as well. I'll send
out a v2 with this approach tomorrow.
Thanks,
Friedrich
Would that work as intended?
Michal