On Tue, Apr 14, 2020 at 09:45:08PM -0700, Andrii Nakryiko wrote:
> >
> > > FD is closed, dumper program is detached and dumper is destroyed
> > > (unless pinned in bpffs, just like with any other bpf_link.
> > > 3. At this point bpf_dumper_link can be treated like a factory of
> > > seq_files. We can add a new BPF_DUMPER_OPEN_FILE (all names are for
> > > illustration purposes) command, that accepts dumper link FD and
> > > returns a new seq_file FD, which can be read() normally (or, e.g.,
> > > cat'ed from shell).
> >
> > In this case, link_query may not be accurate if a bpf_dumper_link
> > is created but no corresponding bpf_dumper_open_file. What we really
> > need to iterate through all dumper seq_file FDs.
>
> If the goal is to iterate all the open seq_files (i.e., bpfdump active
> sessions), then bpf_link is clearly not the right approach. But I
> thought we are talking about iterating all the bpfdump programs
> attachments, not **sessions**, in which case bpf_link is exactly the
> right approach.

That's an important point. What is the pinned
/sys/kernel/bpfdump/tasks/foo ?
Every time 'cat' opens it a new seq_file is created with a new FD, right ?
Reading that file can take an unbounded amount of time, since 'cat' can
be paused in the middle.
I think we're dealing with several different kinds of objects here.
1. "template" of seq_file that is seen with 'ls' in /sys/kernel/bpfdump/
2. given instance of seq_file after the "template" was opened
3. bpfdumper program
4. and now links. One bpf_link from seq_file template to bpf prog and
   many other bpf_links from actual seq_file kernel object to bpf prog.
   I think both kinds of links need to be iterable via get_next_id.

At the same time I don't think 1 and 2 are links.
Reading a link FD should not trigger program execution. A link is the
connecting abstraction. It shouldn't be used to trigger anything. It's
static. Otherwise reading a cgroup-bpf link would need to trigger the
cgroup bpf prog too.
FD that points to actual seq_file is the one that should be triggering
iteration of kernel objects and corresponding execution of linked prog.
That FD can be anon_inode returned from raw_tp_open (or something else)
or FD from open("/sys/kernel/bpfdump/foo").

The more I think about all the objects involved the more it feels that
the whole process should consist of three steps (instead of two).
1. load bpfdump prog
2. create seq_file-template in /sys/kernel/bpfdump/
   (not sure which api should do that)
3. use bpf_link_create api to attach bpfdumper prog to that
   seq_file-template

Then when the file is opened a new bpf_link is created for that reading
session.
At the same time both kinds of links (to template and to seq_file)
should be iterable for observability reasons, but get_fd_from_id on
them should probably be disallowed, since holding an FD to these
special links in another process has odd semantics.

Similarly for anon seq_file it should be a three step process as well:
1. load bpfdump prog
2. create anon seq_file (api is tbd) that returns FD
3. use bpf_link_create to attach prog to seq_file FD

Maybe it's all overkill. These are just my thoughts so far.
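
To make the object relationships above concrete, here is a toy user-space
model of the three-step flow, written in Python rather than kernel C. It
is only a sketch: Prog, Link, Registry, Template, SeqFile and
all_link_ids are hypothetical names invented for illustration, and none
of them is a kernel or libbpf API. The only things taken from the
discussion are the shape of the flow (load prog, create template, attach
with a link, one extra link per open) and the idea of walking links with
a get_next_id style call.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, List, Optional


@dataclass
class Prog:
    """Stand-in for a loaded bpfdump program (step 1). Hypothetical."""
    name: str
    callback: Callable[[str], str]  # invoked once per iterated object


@dataclass
class Link:
    """Static connection between a prog and its target.

    Deliberately has no read(): a link never triggers the program.
    """
    id: int
    prog: Prog
    target: object  # a Template or a SeqFile


class Registry:
    """Hands out link IDs and supports get_next_id style iteration."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.links: Dict[int, Link] = {}

    def link_create(self, prog: Prog, target) -> Link:
        link = Link(next(self._ids), prog, target)
        self.links[link.id] = link
        target.link = link
        return link

    def link_destroy(self, link: Link) -> None:
        del self.links[link.id]

    def get_next_id(self, start_id: int) -> Optional[int]:
        later = [i for i in self.links if i > start_id]
        return min(later) if later else None


class SeqFile:
    """One reading session; owns its own link to the prog."""

    def __init__(self, registry: Registry, prog: Prog, objects: List[str]):
        self.registry = registry
        self.objects = objects
        self.link: Optional[Link] = None
        registry.link_create(prog, self)  # sets self.link

    def read(self) -> List[str]:
        # Reading the seq_file is what walks the objects and runs the prog.
        return [self.link.prog.callback(obj) for obj in self.objects]

    def close(self) -> None:
        self.registry.link_destroy(self.link)


class Template:
    """What 'ls' would show under /sys/kernel/bpfdump/ (step 2)."""

    def __init__(self, registry: Registry, path: str, objects: List[str]):
        self.registry = registry
        self.path = path
        self.objects = objects
        self.link: Optional[Link] = None  # set by link_create (step 3)

    def open(self) -> SeqFile:
        if self.link is None:
            raise RuntimeError("no prog attached to template")
        return SeqFile(self.registry, self.link.prog, self.objects)


def all_link_ids(registry: Registry) -> List[int]:
    """Walk every link ID, template links and session links alike."""
    ids, cur = [], registry.get_next_id(0)
    while cur is not None:
        ids.append(cur)
        cur = registry.get_next_id(cur)
    return ids
```

In this model, attaching a prog to a template creates one link, and each
open() of the template adds one more link for that session. Walking the
registry with all_link_ids() sees both kinds, and closing a session
removes only that session's link. Only SeqFile.read() runs the prog
callback; Link has no read method at all, which mirrors the point that a
link is static and should not trigger anything.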