Re: [RFC bpf-next] Hierarchical Cgroup Stats Collection Using BPF

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Wed, 16 Mar 2022 09:13:00 -0700

On Tue, Mar 15, 2022 at 11:05 PM Song Liu <song@xxxxxxxxxx> wrote:
>
> On Wed, Mar 9, 2022 at 12:27 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> >
> [...]
> >
> > The map usage by BPF programs and integration with rstat can be as follows:
> > - Internally, each map entry has per-cpu arrays, a total array, and a
> > pending array. BPF programs and user space only see one array.
> > - The update interface is disabled. BPF programs use helpers to modify
> > elements. Internally, the helpers modify the per-cpu arrays and then
> > call cgroup_bpf_updated() or an equivalent.
> > - Lookups (from BPF programs or user space) invoke an rstat flush and
> > read from the total array.
>
> Lookups invoke an rstat flush, so we still walk every node of a subtree for
> each lookup, no? So the actual cost should be similar to walking the
> subtree with some BPF program? Did I miss something?
>

Hi Song,

Thanks for taking the time to read my proposal.

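The rstat framework maintains per-cpu trees that contain only the
cgroups updated since the last flush. An rstat flush traverses only
these trees, not the whole cgroup subtree/hierarchy, so its cost
scales with the number of updated cgroups rather than with the size
of the subtree.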

This also ensures that consecutive readers do not have to do any
traversals unless new updates happen, because the first reader will
have already flushed the stats.
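
To make this concrete, here is a rough userspace sketch of the idea. It
is not kernel code: struct cg, cg_updated(), cg_flush(), cg_read() and
cg_nodes_visited are hypothetical names made up for illustration, it
keeps a single updated tree instead of one per cpu, and it ignores
locking entirely. It only shows why a flush touches updated cgroups and
nothing else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS 4	/* assumption: small fixed cpu count for the sketch */

/* Hypothetical stand-in for a cgroup with one hierarchical counter. */
struct cg {
	struct cg *parent;
	struct cg *updated_children;	/* children with unflushed updates */
	struct cg *updated_next;	/* link on parent's updated list */
	int on_updated;			/* already linked into the tree? */
	long percpu[NR_CPUS];		/* per-cpu deltas, not yet flushed */
	long pending;			/* deltas pushed up by children */
	long total;			/* flushed hierarchical total */
};

int cg_nodes_visited;	/* instrumentation: nodes touched by flushes */

void cg_init(struct cg *c, struct cg *parent)
{
	memset(c, 0, sizeof(*c));
	c->parent = parent;
}

/* Update path: bump a per-cpu counter and link c into the updated tree. */
void cg_updated(struct cg *c, int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->percpu[cpu] += delta;

	/* Link c and its ancestors; stop at the first one already linked. */
	while (c->parent && !c->on_updated) {
		c->on_updated = 1;
		c->updated_next = c->parent->updated_children;
		c->parent->updated_children = c;
		c = c->parent;
	}
}

/* Flush path: walk only the updated tree below c, children first. */
void cg_flush(struct cg *c)
{
	struct cg *child = c->updated_children;
	long delta = 0;
	int cpu;

	cg_nodes_visited++;
	c->updated_children = NULL;

	while (child) {
		struct cg *next = child->updated_next;

		child->updated_next = NULL;
		child->on_updated = 0;
		cg_flush(child);
		child = next;
	}

	/* Fold per-cpu deltas and children's deltas into the total. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		delta += c->percpu[cpu];
		c->percpu[cpu] = 0;
	}
	delta += c->pending;
	c->pending = 0;

	c->total += delta;
	if (c->parent)
		c->parent->pending += delta;
}

/* Lookup path: flush first, then read the total. */
long cg_read(struct cg *c)
{
	cg_flush(c);
	return c->total;
}
```

With a root that has children a and b, and a1 below a, updating only a1
and then reading the root visits root, a and a1, but never b. A second
read with no new updates visits only the root, which is the
"consecutive readers" case above.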

>
> Thanks,
> Song
>
> > - In cgroup_rstat_flush_locked() flush BPF stats as well.
> >
> [...]