On 6/21/21 11:47 PM, rainkin wrote:
On 6/21/21 6:12 AM, rainkin wrote:
Hi,
My ebpf program is attached to kprobe/vfs_read. My use case is to store
information about each file (i.e., inode) of each process by using
map-in-map (e.g., the outer map is a hash map where the key is the pid and
the value is an inner map where the key is the inode and the value is some
stateful information I want to store).
Thus I need to create a new inner map for each newly seen inode.
I know that task/inode local storage exists; however, it is not
available on my kernel version (4.1x), so I cannot use it.
I tried two methods:
1. Dynamically create a new inner map from the user-land program by
following this sample:
https://github.com/torvalds/linux/blob/master/samples/bpf/test_map_in_map_user.c
Then insert the new inner map into the outer map.
The limitation of this method:
It requires the ebpf kernel program to send a message to the user-land
program to create the new inner map,
and the ebpf kernel program might access the map before the user-land
program finishes the job.
2. Thus, I prefer the second method: dynamically create inner maps in
the kernel ebpf program.
According to the discussion in the following thread, it seems that it
can be done by calling bpf_map_update_elem():
https://lore.kernel.org/bpf/878sdlpv92.fsf@xxxxxxx/T/#e9bac624324ffd3efb0c9f600426306e3a40ec7b5
Creating a new map for map_in_map from bpf prog can be implemented.
bpf_map_update_elem() is doing memory allocation for map elements. In such a case calling
this helper on map_in_map can, in theory, create a new inner map and insert it into the outer map.
However, when I call it to create a new inner map, the verifier returns this error:
64: (bf) r2 = r10
65: (07) r2 += -144
66: (bf) r3 = r10
67: (07) r3 += -176
; bpf_map_update_elem(&outer, &ino, &new_inner, BPF_ANY);
68: (18) r1 = 0xffff8dfb7399e400
70: (b7) r4 = 0
71: (85) call bpf_map_update_elem#2
cannot pass map_type 13 into func bpf_map_update_elem#2
This is expected based on current verifier implementation.
In the verifier's check_map_func_compatibility() function, we have

	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
	case BPF_MAP_TYPE_HASH_OF_MAPS:
		if (func_id != BPF_FUNC_map_lookup_elem)
			goto error;
		break;

For array/hash map-in-map, the only supported helper
is bpf_map_lookup_elem(). bpf_map_update_elem()
is not supported yet.
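
To make the error text easier to read: in "cannot pass map_type 13 into func bpf_map_update_elem#2", 13 is the numeric value of BPF_MAP_TYPE_HASH_OF_MAPS and 2 is the helper id of bpf_map_update_elem. Below is a minimal user-space sketch of the quoted check, not the kernel code itself. The MODEL_* names and model_map_func_compatible() are hypothetical and exist only for this illustration; the numeric values mirror the kernel's enums, and every map type other than map-in-map is simply accepted here, which the real verifier does not do.

```c
#include <stdbool.h>

/* Subset of the kernel's enum bpf_map_type, same numeric values. */
enum model_map_type {
	MODEL_MAP_TYPE_HASH          = 1,
	MODEL_MAP_TYPE_ARRAY         = 2,
	MODEL_MAP_TYPE_ARRAY_OF_MAPS = 12,
	MODEL_MAP_TYPE_HASH_OF_MAPS  = 13,
};

/* Subset of the kernel's enum bpf_func_id, same numeric values. */
enum model_func_id {
	MODEL_FUNC_map_lookup_elem = 1,
	MODEL_FUNC_map_update_elem = 2,
	MODEL_FUNC_map_delete_elem = 3,
};

/*
 * Hypothetical model of the quoted switch case: returns true when the
 * helper may be called on the given map type. Only the map-in-map case
 * is modelled; all other types are accepted (a simplification).
 */
bool model_map_func_compatible(enum model_map_type type,
			       enum model_func_id func)
{
	switch (type) {
	case MODEL_MAP_TYPE_ARRAY_OF_MAPS:
	case MODEL_MAP_TYPE_HASH_OF_MAPS:
		/* Only lookup is allowed on map-in-map from a BPF program. */
		return func == MODEL_FUNC_map_lookup_elem;
	default:
		return true;
	}
}
```

With this model, (HASH_OF_MAPS, map_update_elem) is rejected while (HASH_OF_MAPS, map_lookup_elem) is accepted, which matches the verifier output quoted above.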
Thanks for your answer!
If I understand correctly, the conclusion is that (at least for now) an
*ebpf kernel program*
CAN only do lookup on array/hash map-in-map, CANNOT
add/update/delete entries in array/hash
map-in-map, and CANNOT create regular hash/array maps dynamically.
Right.
For your method #1, the bpf helper bpf_send_signal() or
bpf_send_signal_thread() might help to send some info
to user space, but I think they are not available in
4.x kernels.
Maybe a single map keyed by (pid, inode) would work?
new_inner is a structure of inner hashmap.
Any suggestions?
Thanks,
Rainkin
A single map with key (pid, inode) is OK for the above scenario. However,
when I want to clean up all entries related to a certain pid when a
process exits,
a single map is NOT OK: I need to go through all the keys of the
single map and delete the keys related
to that pid.
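
The full-map walk described above could be sketched from user space roughly as follows. This is only an illustration under stated assumptions: struct pid_ino_key and purge_pid() are hypothetical names, and the two function-pointer types merely copy the shapes of libbpf's bpf_map_get_next_key() and bpf_map_delete_elem() so that the loop compiles without libbpf. The mock_* functions are a tiny in-memory stand-in for a hash map, included only so the loop can be exercised without a kernel. Starting the walk with a NULL key assumes a kernel that supports it (4.12 or later, as far as I know).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical composite key; explicit pad keeps every byte defined. */
struct pid_ino_key {
	uint64_t ino;
	uint32_t pid;
	uint32_t pad;
};

/* Same shapes as libbpf's bpf_map_get_next_key()/bpf_map_delete_elem(). */
typedef int (*next_key_fn)(int fd, const void *key, void *next_key);
typedef int (*delete_fn)(int fd, const void *key);

/* Walk every key and delete those owned by pid; returns number removed. */
int purge_pid(int fd, uint32_t pid, next_key_fn next, delete_fn del)
{
	struct pid_ino_key cur, nxt;
	int removed = 0;
	int more = next(fd, NULL, &cur) == 0;

	while (more) {
		/* Fetch the successor first: after a delete, asking for the
		 * key following a missing key restarts from the first key. */
		more = next(fd, &cur, &nxt) == 0;
		if (cur.pid == pid && del(fd, &cur) == 0)
			removed++;
		if (more)
			cur = nxt;
	}
	return removed;
}

/* In-memory stand-in for the map, for illustration only. */
#define MOCK_CAP 8
static struct pid_ino_key mock_keys[MOCK_CAP];
static int mock_used[MOCK_CAP];

static int mock_next_key(int fd, const void *key, void *next_key)
{
	size_t start = 0;

	(void)fd;
	for (size_t i = 0; key && i < MOCK_CAP; i++) {
		if (mock_used[i] && !memcmp(&mock_keys[i], key, sizeof(mock_keys[i]))) {
			start = i + 1;
			break;
		}
	}
	for (size_t i = start; i < MOCK_CAP; i++) {
		if (mock_used[i]) {
			memcpy(next_key, &mock_keys[i], sizeof(mock_keys[i]));
			return 0;
		}
	}
	return -1; /* no more keys */
}

static int mock_delete(int fd, const void *key)
{
	(void)fd;
	for (size_t i = 0; i < MOCK_CAP; i++) {
		if (mock_used[i] && !memcmp(&mock_keys[i], key, sizeof(mock_keys[i]))) {
			mock_used[i] = 0;
			return 0;
		}
	}
	return -1; /* key not found */
}
```

Against a real map, the same loop would presumably be called as purge_pid(map_fd, pid, bpf_map_get_next_key, bpf_map_delete_elem). The cost is one pass over every key in the map per exiting process, which is the expense being discussed here.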
I understand this. Totally agree that it is expensive for the cleanup.
In such cases, map_in_map is the best strategy.
Alexei recently added support for calling the bpf create_map/update_map
syscall commands from a bpf program ([1]). This needs a new program
type, though.
In your particular case, you are doing kprobe/vfs_read, which runs
in process context at the beginning of a syscall, so it is probably
safe to call the create/update_map syscalls (I did not look at the
kernel code thoroughly). But the verifier needs to ensure it is
indeed safe. There is some ongoing compiler annotation work ([2])
which may help annotate such functions so the verifier can do
its job effectively.
BTW, this is all future work. For now, esp. if you are using
4.1x kernels, I guess (pid, inode) is probably your best shot.
[1] https://lore.kernel.org/bpf/20210514003623.28033-2-alexei.starovoitov@xxxxxxxxx/
[2] https://reviews.llvm.org/D103667
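
As a rough illustration of the single-map fallback with a (pid, inode) key, the key could be laid out as below. This is a sketch in plain C; struct pid_ino_key and make_key() are hypothetical names, not anything from the thread or the kernel. The explicit pad field and the memset are there because hash map keys are compared byte by byte, so any uninitialized padding would make two logically equal keys differ (and, in a BPF program, the verifier rejects a key that contains uninitialized stack bytes).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical composite key for one hash map shared by all processes. */
struct pid_ino_key {
	uint64_t ino; /* inode number of the file */
	uint32_t pid; /* owning process */
	uint32_t pad; /* explicit padding, always zero */
};

/* Build a key with every byte defined. */
static inline struct pid_ino_key make_key(uint32_t pid, uint64_t ino)
{
	struct pid_ino_key k;

	memset(&k, 0, sizeof(k)); /* clears pad as well */
	k.pid = pid;
	k.ino = ino;
	return k;
}
```

Inside a BPF program the same layout would be used as the key type of a regular BPF_MAP_TYPE_HASH map, with __builtin_memset() in place of memset().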