On 4/15/20 1:27 PM, Yonghong Song wrote:
>
> As there are some discussions regarding the kernel interface/steps to
> create file/anonymous dumpers, I think it will be beneficial for
> discussion with this work in progress.
>
> Motivation:
>   The current ways to dump kernel data structures are mostly:
>     1. /proc system
>     2. various specific tools like "ss" which require kernel support.
>     3. drgn
>   The drawback for the first two is that whenever you want to dump more, you
>   need to change the kernel. For example, Martin wants to dump socket local

If kernel support is needed for bpfdump of kernel data structures, you
are not really solving the kernel support problem, i.e., to dump
ipv4_route entries you need to modify the relevant proc show function.

>   storage with "ss". Kernel change is needed for it to work ([1]).
>   This is also the direct motivation for this work.
>
>   drgn ([2]) solves this problem nicely and no kernel change is needed.
>   But since drgn is not able to verify the validity of a particular pointer value,
>   it might present the wrong results in rare cases.
>
>   In this patch set, we introduce bpf based dumping. Initial kernel changes are
>   still needed, but a data structure change will not require kernel changes
>   any more. bpf program itself is used to adapt to new data structure
>   changes. This will give certain flexibility with guaranteed correctness.
>
>   Here, kernel seq_ops is used to facilitate dumping, similar to current
>   /proc and many other lossless kernel dumping facilities.
>
> User Interfaces:
>   1. A new mount file system, bpfdump at /sys/kernel/bpfdump is introduced.
>      Different from /sys/fs/bpf, this is a single user mount. Mount command
>      can be:
>         mount -t bpfdump bpfdump /sys/kernel/bpfdump
>   2. Kernel bpf dumpable data structures are represented as directories
>      under /sys/kernel/bpfdump, e.g.,
>        /sys/kernel/bpfdump/ipv6_route/
>        /sys/kernel/bpfdump/netlink/

The names of bpfdump fs entries do not match actual data structure names
- e.g., there is no ipv6_route struct. On the one hand that is a good
thing since structure names can change, but that also means a mapping is
needed between the dumper filesystem entries and what you get for context.

Further, what is the expectation in terms of stable API for these fs
entries? Entries in the context can change. Data structure names can
change. Entries in the structs can change. All of that breaks the idea
of stable programs that are compiled once and run for all future
releases. When structs change, those programs will break - and
structures will change.

What does bpfdump provide that you cannot do with a tracepoint on a
relevant function and then putting a program on the tracepoint? i.e.,
why not just put a tracepoint in the relevant dump functions?
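
As a rough illustration of the struct-change concern above, here is a minimal sketch in plain userspace C (not BPF, and not code from the patch set). The structs route_v1/route_v2 and the helper names are made up for this example; they only assume that a program bakes in a field offset from the headers it was compiled against.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout in release N; not a real kernel struct. */
struct route_v1 {
	uint32_t dst;
	uint32_t metric;
};

/* Hypothetical layout in release N+1: a field was added before metric. */
struct route_v2 {
	uint32_t dst;
	uint32_t flags;
	uint32_t metric;
};

/* Offset of "metric" as seen by a program built against release N. */
static size_t metric_offset_v1(void)
{
	return offsetof(struct route_v1, metric);
}

/* Offset of "metric" as seen by a program built against release N+1. */
static size_t metric_offset_v2(void)
{
	return offsetof(struct route_v2, metric);
}

/* Read a 32-bit value at a fixed byte offset, as a precompiled program would. */
static uint32_t read_u32_at(const void *obj, size_t off)
{
	uint32_t val;

	memcpy(&val, (const char *)obj + off, sizeof(val));
	return val;
}
```

Reading a route_v2 object with the offset from metric_offset_v1() returns the flags field rather than metric, which is the kind of silent breakage meant above when a program is compiled once against one layout and run against another.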