Re: [RFC bpf-next] Hierarchical Cgroup Stats Collection Using BPF

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Wed, 16 Mar 2022 09:11:07 -0700

On Tue, Mar 15, 2022 at 11:05 PM Song Liu <song@xxxxxxxxxx> wrote:
On Wed, Mar 9, 2022 at 12:27 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:

>

[...]

>

> The map usage by BPF programs and integration with rstat can be as follows:
> - Internally, each map entry has per-cpu arrays, a total array, and a
>   pending array. BPF programs and user space only see one array.
> - The update interface is disabled. BPF programs use helpers to modify
>   elements. Internally, the modifications are made to per-cpu arrays,
>   and invoke a call to cgroup_bpf_updated() or an equivalent.
> - Lookups (from BPF programs or user space) invoke an rstat flush and
>   read from the total array.

Lookups invoke an rstat flush, so we still walk every node of a subtree for
each lookup, no? So the actual cost should be similar to walking the
subtree with some BPF program? Did I miss something?

Hi Song,

Thanks for taking the time to read my proposal.

The rstat framework maintains a tree that contains only updated cgroups. An rstat flush only traverses this tree, not the cgroup subtree/hierarchy.

This also means that consecutive readers do not have to do any traversal unless new updates have happened in between, because the first reader will have already flushed the stats.
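
To make the point concrete, here is a rough userspace sketch of the idea, not the kernel code. All names in it (struct cg, cg_updated(), cg_read(), flush_subtree(), nodes_visited) are made up for illustration, and it assumes a single CPU and no locking, whereas the real rstat code keeps per-cpu updated trees and takes locks. An update links the cgroup and its ancestors into the updated tree, and a read flushes only what is linked there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an rstat-style "updated tree". */
struct cg {
    struct cg *parent;
    struct cg *updated_children; /* children that have unflushed updates */
    struct cg *updated_next;     /* next sibling on the parent's updated list */
    int on_updated_tree;         /* set while this node awaits a flush */
    long pending;                /* unflushed delta (stands in for per-cpu data) */
    long total;                  /* hierarchical total as of the last flush */
};

static int nodes_visited;        /* counts nodes touched by flushes */

/* Update side: record a delta and link the cgroup and any ancestors that
 * are not yet on the updated tree. Stops at the first linked ancestor. */
static void cg_updated(struct cg *c, long delta)
{
    c->pending += delta;
    for (; c && !c->on_updated_tree; c = c->parent) {
        c->on_updated_tree = 1;
        if (c->parent) {
            c->updated_next = c->parent->updated_children;
            c->parent->updated_children = c;
        }
    }
}

/* Flush side: visit only nodes on the updated tree below (and including) c,
 * fold each pending delta into the total and pass it up to the parent. */
static void flush_subtree(struct cg *c)
{
    struct cg *child = c->updated_children;

    c->updated_children = NULL;
    c->on_updated_tree = 0;
    while (child) {
        struct cg *next = child->updated_next;

        child->updated_next = NULL;
        flush_subtree(child);
        child = next;
    }
    nodes_visited++;
    c->total += c->pending;
    if (c->parent)
        c->parent->pending += c->pending;
    c->pending = 0;
}

/* Read side: flush only if something under c was updated, then read. */
static long cg_read(struct cg *c)
{
    if (c->on_updated_tree) {
        if (c->parent) {
            /* Unlink c from its parent's updated list before flushing. */
            struct cg **pp = &c->parent->updated_children;

            while (*pp != c) {
                assert(*pp != NULL); /* c must be on the list if flagged */
                pp = &(*pp)->updated_next;
            }
            *pp = c->updated_next;
            c->updated_next = NULL;
        }
        flush_subtree(c);
    }
    return c->total;
}
```

With a root that has two children and two grandchildren under one of them, updating a single grandchild and then reading the root touches only the three nodes on the path to it, and a second read right after touches none.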

Thanks,

Song

> - In cgroup_rstat_flush_locked() flush BPF stats as well.

>

[...]