On Thu, Nov 18, 2021 at 11:33 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Nov 18, 2021 at 03:28:40PM -0500, Kenny Ho wrote:
> > +	for_each_possible_cpu(cpu) {
> > +		/* allocate first, connect the cgroup later */
> > +		events[i] = perf_event_create_kernel_counter(attr, cpu, NULL, NULL, NULL);
>
> This is a very heavy hammer for this task.
> There is really no need for perf_event to be created.
> Did you consider using raw_tp approach instead?

I came across raw_tp but I don't have a good understanding of it yet.
Initially I was hoping perf event/tracepoint would be a stepping stone
to raw tp, but that doesn't seem to be the case (and unfortunately I
picked the perf event/tracepoint route to dive into first because I saw
cgroup usage there.)  Can you confirm whether the following statements
are true?

- is raw_tp related to writable tracepoints?
- are perf_event/tracepoint/kprobe/uprobe and fentry/fexit/raw_tp
  considered two separate 'things' (even though both of their purposes
  is tracing)?

> It doesn't need this heavy stuff.
> Also I suspect in follow up you'd be adding tracepoints to GPU code?
> Did you consider just leaving few __weak global functions in GPU code
> and let bpf progs attach to them as fentry?

There are already tracepoints in the GPU code.  I do like the fentry
way of doing things more, but my head was very much focused on cgroup,
and the tracepoint/kprobe path seemed to have something for it.  I
suspected this would be a bit too heavy after seeing the scalability
discussion, but I wasn't sure, so I whipped this up quickly to get some
feedback (while learning more about perf/bpf/cgroup.)

> I suspect the true hierarchical nature of bpf-cgroup framework isn't necessary.
> The bpf program itself can filter for given cgroup.
> We have bpf_current_task_under_cgroup() and friends.

Is there a way to access cgroup local storage from a prog that is not
attached to a bpf-cgroup?  Although, I guess I can just store/read
things using a map with the cgroup id as the key.  And with
bpf_get_current_ancestor_cgroup_id() below I can simulate the values
being propagated if the hierarchy ends up being relevant.

Then again, is there a way to atomically update multiple elements of a
map?  I am trying to figure out how to support a multi-user, multi-app
sharing use case (e.g. user A is given quota X and user B quota Y, with
app 1 and 2 each having a quota assigned by A, and app 8 and 9 each
having a quota assigned by B.)  Is there some kind of 'lock' mechanism
for me to keep quotas 1, 2 and X in sync?  (Same for 8, 9 and Y.)

> I suggest to sprinkle __weak empty funcs in GPU and see what
> you can do with it with fentry and bpf_current_task_under_cgroup.
> There is also bpf_get_current_ancestor_cgroup_id().
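
To make sure I am picturing your suggestion correctly, below is a rough
sketch of the fentry + bpf_current_task_under_cgroup() approach as I
currently understand it.  Everything specific here is made up for
illustration: "amdgpu_submit_hook" stands for one of the __weak empty
funcs you suggested sprinkling in the GPU driver (it does not exist
today), the maps are mine, and the ancestor level 2 is arbitrary.  The
hash map keyed by cgroup id is my stand-in for cgroup local storage,
and the second charge() call is how I imagined simulating the
hierarchical propagation that a bpf-cgroup attachment would otherwise
give me:

	// SPDX-License-Identifier: GPL-2.0
	/* Sketch only; hook name, maps and ancestor level are
	 * illustrative, not existing interfaces. */
	#include "vmlinux.h"
	#include <bpf/bpf_helpers.h>
	#include <bpf/bpf_tracing.h>

	/* cgroup of interest, populated from user space with a cgroup fd */
	struct {
		__uint(type, BPF_MAP_TYPE_CGROUP_ARRAY);
		__uint(max_entries, 1);
		__type(key, u32);
		__type(value, u32);
	} cgrp_map SEC(".maps");

	/* per-cgroup counters keyed by cgroup id, standing in for cgroup
	 * local storage (which I believe needs a cgroup-attached prog) */
	struct {
		__uint(type, BPF_MAP_TYPE_HASH);
		__uint(max_entries, 4096);
		__type(key, u64);
		__type(value, u64);
	} gpu_usage SEC(".maps");

	static __always_inline void charge(u64 cgid)
	{
		u64 *val, one = 1;

		val = bpf_map_lookup_elem(&gpu_usage, &cgid);
		if (val)
			__sync_fetch_and_add(val, 1);
		else
			bpf_map_update_elem(&gpu_usage, &cgid, &one, BPF_NOEXIST);
	}

	SEC("fentry/amdgpu_submit_hook")	/* hypothetical __weak hook */
	int BPF_PROG(on_gpu_submit)
	{
		/* ignore tasks outside the cgroup (sub)tree in slot 0 */
		if (!bpf_current_task_under_cgroup(&cgrp_map, 0))
			return 0;

		/* charge the task's own cgroup ... */
		charge(bpf_get_current_cgroup_id());
		/* ... and an ancestor (level 2 picked arbitrarily), to
		 * simulate propagation up the hierarchy */
		charge(bpf_get_current_ancestor_cgroup_id(2));
		return 0;
	}

	char LICENSE[] SEC("license") = "GPL";

I used a plain hash map with __sync_fetch_and_add rather than a per-cpu
map so the per-cgroup totals stay in one place, but this sketch still
leaves open the multi-element quota question above, since each charge()
only touches a single map element at a time.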